Simulation speed on VMware

Asked by fengjingyu

Hi,

I installed Ubuntu on VMware Workstation and allocated 32 GB of memory and 12 processors to the virtual machine. However, Yade did not run noticeably faster than it did with 4 GB of memory and 2 processors. Do I need to add some parallel code to my script?

Thanks,

Feng

Question information

Language: English
Status: Solved
For: Yade
Assignee: No assignee
Solved by: Robert Caulk

Robert Caulk (rcaulk) said :
#1

Hello,

>> I allocated 32G of memory and 12 processors to the virtual machine.

Please provide your hardware specs too.

>>I found that yade's running speed was not qualitatively changed

Numbers would be helpful, try using O.speed and reporting back.
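
As a rough illustration of the kind of number being asked for, the sketch below measures average iterations per second of wall-clock time for any step function. The names measure_speed and dummy_step are hypothetical helpers, not part of Yade, and the dummy workload only exists so the sketch runs outside Yade; inside a Yade session, reading O.speed after a run is the simpler route.

```python
import time

def measure_speed(step, n_iter):
    """Return average iterations per second over n_iter calls of step()."""
    start = time.perf_counter()
    for _ in range(n_iter):
        step()
    elapsed = time.perf_counter() - start
    return n_iter / elapsed

def dummy_step():
    # Stand-in workload (assumption); a real test would advance the simulation.
    sum(i * i for i in range(1000))

speed = measure_speed(dummy_step, 200)
print("%.1f iter/s" % speed)
```

Running the same measurement once with and once without the -j option gives two directly comparable figures.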

>> Do I need to add some parallel code to my code?

Please provide the command you use to run Yade. Yes, parallel yade requires the -jX argument (where X is the number of cores you want to use).

Best,

Robert

fengjingyu (fengjing) said :
#2

Hi Robert,

>>Please provide your hardware specs too.

The processor is x64-based and the installed memory is 64 GB.

>>Numbers would be helpful, try using O.speed and reporting back.

I use yade.timing.stats() to monitor the speed. Comparing yade -j 4 a.py with yade a.py, I ran the same number of steps, and both took almost the same amount of time.

>>parallel yade requires the -jX argument

The programmer's manual lists two options for running in parallel: -j THREADS and --cores=CORES. I don't know which one to choose.

Robert Caulk (rcaulk) said :
#3

>>The processor is x64-based and the installed memory is 64 GB.

How many cores does your processor have?

>>I use yade.timing.stats() to monitor the speed. Comparing yade -j 4 a.py with yade a.py, I ran the same number of steps, and both took almost the same amount of time.

Can you provide the output/numbers?

>>The programmer's manual lists two options for running in parallel: -j THREADS and --cores=CORES. I don't know which one to choose.

Use -jX.

>>yade -j 4 a.py

should be yade -j4 a.py (no space between -j and 4).

Jérôme Duriez (jduriez) said :
#4

Regarding Robert's comment #3, is it true that a space is not accepted between "-j" and, e.g., "4"?

If I launch yadedaily -j 4 (with a space) and call O.run() (for an empty simulation), I can see with htop that 4 cores are working.

With plain yadedaily (no -j option), htop shows only one core working.

Jérôme Duriez (jduriez) said :
#5

See also
yadedaily -j 4 --performance (maybe more convincing than an empty simulation..)

vs
yadedaily -j4 --performance

and
yadedaily --performance

fengjingyu (fengjing) said :
#6

>> How many cores does your processor have?

My computer has 16 processors and I allocated 12 of them to the virtual machine.

>>Can you provide the output/numbers?

After 2100 iterations, yade -j4 a.py took 3441112582 us and yade a.py took 3100565993 us. It's strange.
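
To put the two timings side by side, the short sketch below turns them into a speedup and a parallel efficiency. The helper name speedup_and_efficiency is hypothetical; the two durations are the ones reported above, and 4 is the thread count passed with -j4.

```python
def speedup_and_efficiency(t_serial, t_parallel, threads):
    """Speedup is serial time over parallel time; efficiency is speedup per thread."""
    speedup = t_serial / t_parallel
    return speedup, speedup / threads

t_serial_us = 3100565993    # reported for: yade a.py
t_parallel_us = 3441112582  # reported for: yade -j4 a.py

s, e = speedup_and_efficiency(t_serial_us, t_parallel_us, 4)
print("speedup: %.3f, efficiency: %.3f" % (s, e))
```

A speedup below 1 means the four-thread run was slower than the single-thread run, which is what these numbers show (roughly 0.90).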

My code is :

#####################
# encoding: utf-8
from yade import pack, qt, plot
import matplotlib; matplotlib.rc('axes',grid=True)
import pylab
import yade.timing
O.timingEnabled=True
yade.timing.stats()

sigmaIso=-25000

O.periodic=True

spheres=O.materials.append(FrictMat(young=64e9,poisson=0.12,density=2650,frictionAngle=0.231))
s=O.materials.append(FrictMat(young=64e9,poisson=0.12,density=2650,frictionAngle=0.24))

psdSizes,psdCumm=[0.1,0.12,0.15,0.19,0.20,0.23,0.27,0.28,0.30,0.32,0.33,0.50],[0.0001,0.003,0.03,0.11,0.25,0.43,0.70,0.85,0.95,0.97,0.98,1]

sp=pack.SpherePack()
sp.makeCloud((0,0,0),(4,4,4),psdSizes=psdSizes,psdCumm=psdCumm,distributeMass=True,periodic=True)
n=sp.toSimulation(material=spheres)

O.engines=[
   ForceResetter(),
   InsertionSortCollider([Bo1_Sphere_Aabb()]),
   InteractionLoop(
      [Ig2_Sphere_Sphere_ScGeom()],
      [Ip2_FrictMat_FrictMat_FrictPhys()],
      [Law2_ScGeom_FrictPhys_CundallStrack()]
   ),
   PeriTriaxController(label='triax',
      goal=(sigmaIso,sigmaIso,sigmaIso),stressMask=7,
      dynCell=True,maxStrainRate=(0.5,0.5,0.5),
      maxUnbalanced=1,relStressTol=0.05,
      doneHook='compactionFinished()'
   ),
   NewtonIntegrator(damping=.1),
   PyRunner(command='addPlotData()',iterPeriod=30),
]
O.dt=1*PWaveTimeStep()
O.trackEnergy=True

def particleNumber():
   print(O.bodies[-1].id)

def addPlotData():
   plot.addData(unbalanced=unbalancedForce(),i=O.iter,
      porosityy=voxelPorosity(800,(2,2,2),(3,3,3)),coordNum=avgNumInteractions()

   )

   print('p',voxelPorosity(800,(2,2,2),(3,3,3)),'s',triax.stress[0],triax.stress[1],triax.stress[2],'t',O.iter,yade.timing.stats())

pylab.semilogx(*sp.psd(bins=30,mass=True),label='Mass PSD of (free) %d random spheres'%len(sp))
pylab.legend()
pylab.show()

plot.plots={'i':('unbalanced',),'i ':('sxx','syy','szz'),' i':('exx','eyy','ezz'),' i ':('porosityy',),}
plot.plot()

qt.View()
qt.Controller()
O.saveTmp()

def compactionFinished():
   triax.maxUnbalanced=10
   triax.doneHook='triaxFinished()'

def triaxFinished():
   print('Finished')
   O.pause()

Robert Caulk (rcaulk) said :
#7

>>My computer has 16 processors

16 physical cores and 32 threads, or 8 physical cores and 16 threads? If you have 8 physical cores and you assign 12 to the VM, you will be moving into hyperthreading territory. Not only within the Yade simulation, but your native OS will also be sharing cores with Yade :-). Generally scientific applications avoid hyperthreading like the plague (some people go to lengths to disable it entirely within the BIOS).
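
The arithmetic behind that warning can be sketched as below. The function uses_hyperthreads is a hypothetical helper, and the physical core counts of 8 and 16 are the two cases named above, not a measurement of the actual machine. Note that os.cpu_count() reports logical CPUs only (inside a VM, the vCPUs given to the guest), so the physical core count has to come from the CPU's specifications.

```python
import os

def uses_hyperthreads(requested, physical_cores):
    """True if more cores are requested than physically exist."""
    return requested > physical_cores

# Logical CPUs visible to this OS; may be None if it cannot be determined.
print("logical CPUs seen here:", os.cpu_count())

for physical in (8, 16):
    if uses_hyperthreads(12, physical):
        verdict = "some threads share a physical core"
    else:
        verdict = "every thread can have its own physical core"
    print("%d physical cores, 12 assigned: %s" % (physical, verdict))
```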

One more question, how many bodies are in your simulation? (use len(O.bodies)) Based on the PSD it looks like maybe you have 20k particles? I will try your MWE when my computer finishes some simulations.

Robert Caulk (rcaulk) said :
#8

I think you are right, Jérôme, it doesn't matter if there is a space between -j and the number :-)
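
This matches how Python's standard argparse module treats short options that take a value. The sketch below is a stand-in parser, not Yade's actual launcher code; the program name and the dest name threads are assumptions made for illustration.

```python
import argparse

# Hypothetical stand-in for a launcher that accepts -j and a script name.
parser = argparse.ArgumentParser(prog="launcher")
parser.add_argument("-j", dest="threads", type=int, default=1)
parser.add_argument("script", nargs="?")

attached = parser.parse_args(["-j4", "a.py"])       # no space
separate = parser.parse_args(["-j", "4", "a.py"])   # with space

print(attached.threads, separate.threads)  # prints: 4 4
```

Both spellings yield the same thread count, so the space is a matter of taste for any launcher that parses its options this way.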

fengjingyu (fengjing) said :
#9

>> If you have 8 physical cores and you assign 12 to the VM, you will be moving into hyperthreading territory.

VMware lets me allocate memory and processor cores to the virtual machine. What I am sure of is that I can only allocate fewer processor cores than the computer has; otherwise it reports an error. So I think using -j4 in Yade should not be a mistake.

>>how many bodies are in your simulation?

About 4k.

Robert Caulk (rcaulk) said (best answer) :
#10

I am going to guess that you are using too few bodies to see parallelization performance increase and it seems you are hyperthreading. To see performance increase, increase the number of bodies and do not hyperthread.
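
Why a small body count can hide the benefit of extra threads can be illustrated with a toy cost model. Everything in the sketch below is an assumption for illustration only: the per-body cost, the fixed overhead per extra thread, and the arbitrary time units are made up and are not measurements of Yade.

```python
def step_time(n_bodies, threads, per_body=1.0, overhead_per_thread=2000.0):
    """Toy model: work is split evenly, each extra thread adds a fixed overhead."""
    work = n_bodies * per_body / threads
    overhead = overhead_per_thread * (threads - 1)
    return work + overhead

def speedup(n_bodies, threads):
    """Serial time divided by parallel time under the toy model."""
    return step_time(n_bodies, 1) / step_time(n_bodies, threads)

for n in (4000, 20000, 100000):
    print("%6d bodies, 4 threads: speedup %.2f" % (n, speedup(n, 4)))
```

In this model the fixed overhead dominates for a few thousand bodies, so four threads end up slower than one, while larger packings amortize the overhead and run faster in parallel.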

fengjingyu (fengjing) said :
#11

Thanks Robert Caulk, that solved my question.