cluster run time not improving after increasing the number of CPUs
Dear MG-Team:
I have run two tests of my calculation (both share identical cards):
1- SGE cluster, a single node with 4 CPUs
2- SGE cluster, about 50 CPUs across all nodes.
I noticed that the job splitting increased because at some point many CPUs became available in the cluster (other users' tasks had finished), so my new runs went from about 500 queued jobs, pushing and running 4 at a time on a single node (case 1), to about 4000 jobs spread over all nodes (case 2).
The point is: how is it possible that both calculations ended up with similar running times? Both took about 1 hour 30 minutes, comparing cases 1 and 2. Apparently splitting the calculation did not improve speed, which, at least to me, sounds odd.
From this I imagine that splitting a run into several jobs depends on the cluster status (MG checks this by some means) and that the splitting proceeds at some rate as the calculation advances.
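I do not know how MG actually probes the cluster; for reference, this is roughly how I would check the free slots by hand on SGE (standard SGE commands only, nothing MG-specific, and the exact output format may differ between SGE versions):

    # Summary of all cluster queues: used and available slots per queue
    qstat -g c

    # Per-host view: number of CPUs and current load on each execution host
    qhost

    # Count my own queued/running jobs, to compare with the split size
    qstat -u $USER | tail -n +3 | wc -l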
Is this a signature of a configuration issue?
Could there be an issue with inhomogeneities in the cluster (10 machines with 4 CPUs + 12 machines with 2 CPUs)? Should I use queues of homogeneous machines, e.g. by restricting submission to one queue as sketched below?
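For instance, if the 4-CPU machines were grouped into their own queue (the queue name "quad.q" below is just an example, not something that exists in my setup yet), I could force a run onto homogeneous hardware with something like:

    # Submit a job script to a specific queue only (quad.q is a made-up name)
    qsub -q quad.q ajob1

    # Or pin a job to one particular host
    qsub -l hostname=node05 ajob1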
Is there a way to limit the number of jobs ("ajob" scripts) per run? If the split is too fine, losing a single job may spoil the full calculation.
If splitting did not improve speed, wouldn't it be better to create as many "ajob" scripts as there are CPUs in the cluster? In my case 2 that would have created 50 jobs, submitted them all at once, and finished quickly (see the sketch below).
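Something along these lines, assuming the run directory contains scripts named ajob1 ... ajob50 (following the naming above):

    # Submit every job immediately instead of trickling them in
    for i in $(seq 1 50); do
        qsub ajob$i
    done

    # Alternatively, a single SGE array job with 50 tasks; run_all.sh would be
    # a small wrapper (hypothetical) that dispatches on $SGE_TASK_ID
    qsub -t 1-50 run_all.sh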
Furthermore, I will test the same configuration on my laptop (Mac, 2.4 GHz Intel Core 2 Duo) to check the running time. I believe I already did this, though I am not sure; if I recall correctly the running time was about the same (1h+ for a calculation).
In any case I'd like to help improve this, so let me know if there is anything I can test on my cluster.
best regards,