Optimize computational time for vibrated granular media
Hi everybody!
I'm trying to simulate a mono-disperse vibro-fluidized granular medium in a cylinder with a cone-shaped base.
I'm using the geom tool to build a facetCylinder.
Regarding the interactions, I'm using the Hertz-Mindlin physics.
I've done exactly the same simulation with LAMMPS and run both on the same machine using 4 cores.
Once a stationary state is reached, simulations with 200 or 300 particles (gas-like behaviour) have almost the same computational time in the two codes (YADE is just 20% slower), while in the cases with more than 2000 grains (liquid-like behaviour) LAMMPS is ten times faster!
This is the code I'm using to create the container and the engines. Is there a specific tool to speed up a YADE simulation of a vibro-fluidized granular system in the liquid phase, namely when the particles are in a vibrating box but constantly in contact with each other?
Thanks in advance.
Andrea
_______
O.materials.
O.materials.
coneId=
cyliId=
contenitore=
O.engines=[
	ForceResetter(),
	InsertionSortCollider([...]),
	InteractionLoop(
		[Ig2_Sphere_Sphere_ScGeom(), Ig2_Facet_Sphere_ScGeom()],
		[Ip2_FrictMat_FrictMat_MindlinPhys(...)],
		[Law2_ScGeom_MindlinPhys_Mindlin(...)],
	),
	NewtonIntegrator(...)
]
O.engines = O.engines + [HarmonicMotionEngine(...)]
Question information
- Language: English
- Status: Expired
- For: Yade
- Assignee: No assignee
#1 (Robert Caulk):
Are you running in parallel [1]? LAMMPS is probably automatically using your 4 cores. Yade requires you to pass an argument containing the number of threads you wish to use.
yade -jX script.py
(X is a place holder for your number of cores, not to be taken literally.)
[1]https:/
#2 (Andrea):
Yes, I'm sure; I run the script in this way:
yade -j 4 myscript.py
Then I've checked with the htop command that the code is actually running on four cores.
#3 (Bruno Chareyre):
Very interesting feedback!
It will be difficult to say something without a working script and without specific timings, though.
I thus have questions, only. :)
May I ask:
- "faster": just to be sure, you are speaking of real wall clock time per numerical time iteration, correct? Are timesteps the same?
- "more than 2000", what does it mean? 2500? one million?
- do you have an approximately linear time vs. Nparticles?
- "verletDist=
- would you share more of the data?
- could you report timing.stats() [1]?
- could you show a working script?
Bruno
#4 (Andrea):
Here you find the working code: the comparison I'm talking about is between the cases Nball=300 grains and Nball=2600.
Run the script as:
yade -j 4 -n script.py Nball
I measure the speed of a simulation as the time average of O.speed, or as the slope of the line O.realtime vs O.iter.
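For reference, that slope can be estimated with an ordinary least-squares fit of O.iter against O.realtime. A minimal pure-Python sketch (not part of the original script; sample data below is illustrative):

```python
def iters_per_second(iters, realtimes):
    """Least-squares slope of iteration count vs. wall-clock time,
    i.e. the mean simulation speed in iterations per real second."""
    n = len(iters)
    mx = sum(realtimes) / n
    my = sum(iters) / n
    num = sum((x - mx) * (y - my) for x, y in zip(realtimes, iters))
    den = sum((x - mx) ** 2 for x in realtimes)
    return num / den

# Three samples of (O.realtime, O.iter) on a perfect 1000 it/s run:
print(iters_per_second([0, 10000, 20000], [0.0, 10.0, 20.0]))  # -> 1000.0
```

Feeding it the (O.realtime, O.iter) pairs printed by the PyRunner gives the mean speed directly, without having to pick a single time window by hand.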
The verletDist=2*rball was chosen because after some attempts I found that it makes the 300-ball case faster... but maybe there is a better choice.
This is the timing.stats() output for 50000 steps in the stationary state of the 2600-ball case:

Name              Count    Time         Rel. time
ForceResetter     50000    2226036us    3.09%
"collider"        340      549592us     0.76%
InteractionLoop   50000    57352064us   79.53%
NewtonIntegrator  50000    11860431us   16.45%
"shaker"          50000    121798us     0.17%
PyRunner          5        1367us       0.00%
PyRunner          5        199us        0.00%
TOTAL                      72111491us   100.00%
Thanks!
Andrea
SCRIPT:
+++++++
from yade import pack,ymport,
import numpy as np
import math
import sys
import random
def adder():
if(
sp.makeCloud(
#Useful dimensions
rball=0.002
hcone=6.37*rball
rcont=22.5*rball
rconelow=4*rball
hcil=17.13*rball
Nball=int(
A=0.00025 #amplitude shaker
fr=200 #freq shaker
#Steel
densSteel=8000
ySteel=21e7 #original: e9
poisSteel=0.293
shearModSteel=
#Plexiglass
densPMMA=1190
yPMMA=33e6
poissPMMA=0.37
frictAngle=
#Add material
#plexiglass as lammps
O.materials.
frictionAngle=
#Steel as lammps
O.materials.
frictionAngle=
#rayleigh time
tRay=math.
#container
coneId=
0, 1),0),wallMask=
cyliId=
0, 1),0),wallMask=
contenitore=
nBodyCont=
track=[]
for k in range(nBodyCont
track.
O.engines=[
	ForceResetter(),
	InsertionSortCollider(...),
	InteractionLoop(
		[Ig2_Sphere_Sphere_ScGeom(), Ig2_Facet_Sphere_ScGeom()],
		[Ip2_FrictMat_FrictMat_MindlinPhys(...)],
		[Law2_ScGeom_MindlinPhys_Mindlin(...)],
	),
	NewtonIntegrator(...)
]
O.engines = O.engines + [HarmonicMotionEngine(A=(0,0,A), f=(0,0,fr), label='shaker')]
O.engines = O.engines + [PyRunner(command="print O.iter, O.time, O.speed, O.realtime, kineticEnergy(...", ...),
                         PyRunner(command="adder()", ...)]
O.dt=0.2*tRay
O.run(1000000)
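The tRay lines in the script are truncated. A common Rayleigh-time estimate in DEM is t_R = pi*r*sqrt(rho/G)/(0.1631*nu + 0.8766); a sketch with the steel parameters from the script follows (whether the original tRay expression matched this form exactly is an assumption):

```python
import math

# Steel parameters taken from the script above
rball = 0.002
densSteel = 8000
ySteel = 21e7
poisSteel = 0.293

# Shear modulus from Young's modulus and Poisson's ratio
G = ySteel / (2 * (1 + poisSteel))

# Rayleigh-wave timestep estimate, a standard critical-timestep bound in DEM
tRay = math.pi * rball * math.sqrt(densSteel / G) / (0.1631 * poisSteel + 0.8766)
print(tRay)        # ~6.7e-5 s
print(0.2 * tRay)  # the O.dt=0.2*tRay used in the script, ~1.3e-5 s
```

The softened Young's modulus (21e7 instead of the original 21e9, per the comment in the script) inflates this timestep by a factor of 10, which is a common trick to speed up both codes equally.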
#5 (Bruno Chareyre):
Are you aware that you may not always get the number of spheres you ask for?
WARN /data/trunk/... tries to insert non-overlapping sphere to packing. Only 1133 spheres were added, although you requested 2000.
#6 (Andrea):
Yes, but I call the function adder() (see the code) every 10000 steps until the number of balls is the one I want.
#7 (Bruno Chareyre):
I see.
Well, as I just reduced the radius a little to make them all fit in, I can give you some results with -j4 (I also reduced verletDist, but the speedup is small; script below):
200 spheres: 0.52 sec for 10k iterations
2000 spheres: 5.5 sec for 10k iterations
That's nearly linear.
Unless the cost in LAMMPS is sub-linear (how is that possible?!) I can't imagine how Yade becomes 100x slower. Do you have approximately similar numbers?
Could it be that sphere insertion makes it *very* different?
Bruno
from yade import pack,ymport,
import numpy as np
import math
import sys
import random
def adder():
if(
sp=pack.SpherePack()
sp.makeCloud(
sp.toSimulation()
#Useful dimensions
rball=0.002
hcone=6.37*rball
rcont=22.5*rball
rconelow=4*rball
hcil=17.13*rball
Nball=int(
A=0.00025 #amplitude shaker
fr=200 #freq shaker
#Steel
densSteel=8000
ySteel=21e7 #original: e9
poisSteel=0.293
shearModSteel=
#Plexiglass
densPMMA=1190
yPMMA=33e6
poissPMMA=0.37
frictAngle=
#Add material
#plexiglass as lammps
O.materials.
frictionAngle=
#Steel as lammps
O.materials.
frictionAngle=
#rayleigh time
tRay=math.
#container
coneId=
cyliId=
contenitore=
nBodyCont=
track=[]
for k in range(nBodyCont
track.
O.engines=[
	ForceResetter(),
	InsertionSortCollider(...),
	InteractionLoop(
		[Ig2_Sphere_Sphere_ScGeom(), Ig2_Facet_Sphere_ScGeom()],
		[Ip2_FrictMat_FrictMat_MindlinPhys(...)],
		[Law2_ScGeom_MindlinPhys_Mindlin(...)],
	),
	NewtonIntegrator(...)
]
O.engines = O.engines + [HarmonicMotionEngine(A=(0,0,A), f=(0,0,fr), label='shaker')]
O.engines = O.engines + [PyRunner(command="print O.iter, O.time, O.speed, O.realtime, ...", ...),
                         PyRunner(command="adder()", ...)]
O.dt=0.2*tRay
O.timingEnabled
O.run(20000,1)
timing.reset()
O.timingEnabled
O.run(10000,1)
timing.stats()
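As a sanity check on the timings quoted above (0.52 s and 5.5 s per 10k iterations), the wall-clock cost per particle per iteration is almost constant between the two runs, which is what "nearly linear" means here. A small arithmetic sketch of those figures:

```python
# Bruno's timings from comment #7: N spheres -> seconds per 10k iterations
runs = {200: 0.52, 2000: 5.5}

# Wall-clock cost per particle per iteration
cost = {n: t / 10_000 / n for n, t in runs.items()}
print(cost[200])   # 2.6e-07 s
print(cost[2000])  # 2.75e-07 s

# A ratio close to 1 means near-linear scaling in N
print(cost[2000] / cost[200])  # ~1.06
```

Any strong deviation of this ratio from 1 would point at a super-linear term, e.g. a growing coordination number or collider cost.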
#9 (Bruno Chareyre):
After rolling back to your version with sphere insertion the perfs are still more or less the same (6.7s instead of 5.5s for 2k spheres).
This is measured between iterations 20k and 30k because I lack patience to run 1e6.
Let us know if you find very different results.
Bruno
#10 (Andrea):
In order to have a good comparison we have to gather the statistics once the system has reached the stationary state, after the grains have been poured into the container (from 20k to 30k is still a transient). I give you the results for the mean speed (Nstep/realseconds) between 50k and 500k steps, namely after the pouring, for both cases.
meanSpeed300LAM
meanSpeed2600LA
The numbers that I gave you in the first question were just to give an idea, hoping that someone else had already done this analysis :)
But the situation is almost the same: increasing the number of grains, the ratio between the mean speeds increases in favor of LAMMPS.
In the same time interval I obtain for YADE:
meanSpeed2600YA
while 300/2600=0.12, thus YADE seems not to be linear in this kind of setup.
Andrea
#11 (Bruno Chareyre):
It is difficult for me to understand the meaning of the various values of meanSpeedLAMMPS.
I understand that something is transient physically, but it doesn't seem to matter really in terms of speed. Why do you think so?
For instance 2000 spheres give (3rd column is speed):
20000 0.269857831466 3415.36292082 3.239 0.00195161180227
30000 0.404786747199 1349.67539933 9.901 0.00198843695584
40000 0.539715662932 1214.09029787 17.085 0.00106745354645
50000 0.674644578665 1199.2390826 24.621 0.00122919908665
...
190000 2.56364939893 1258.1333534 131.847 0.00197268858104
200000 2.69857831467 1269.76190104 139.55 0.00190037774033
210000 2.8335072304 1351.34040383 147.32 0.00169256567465
...
440000 5.93687229221 1357.22627666 326.907 0.000706448896021
450000 6.07180120794 1374.03537234 334.778 0.000549054156559
460000 6.20673012367 1322.66851309 342.658 0.000677048337406
The transient aspect of it is not obvious... I don't think it explains our different conclusions.
For 200 spheres:
10000 0.134928915733 85606.0606061 0.457 0.0
20000 0.269857831466 18590.9408551 0.9 0.00123066523288
30000 0.404786747199 16420.3612479 1.486 0.000652081911623
...
190000 2.56364939893 14819.7952972 11.095 0.00067976598448
200000 2.69857831467 16289.4412128 11.704 0.000573059522713
210000 2.8335072304 16424.248529 12.302 0.000804168697981
Still no clear trend after 20k.
The ratio of final speed is ~11 for a ratio in numbers of 10. Still close to linear.
Unless you have a sudden decrease of speed after 800k iterations I can't reproduce or explain your results.
How many cores do you have in total?
Did you try verletDist=0.5*r for 2k spheres? Your timings suggest that the cost of virtual interactions is excessively large and could be reduced by running the collider a bit more often.
Bruno
#12 (Bruno Chareyre):
It is not directly related to computation time, but do you know that right after adder() there is a burst of kinetic energy, because you insert new spheres through the previous ones (randomly overlapping in the same volume)?
I think I realize why this non-linearity appears: you increase the number of particles without changing their size, hence the coordination number is increasing. 2400 spheres is when the box is maximally filled, leading to a non-linear change between 2000 and 2400.
Maybe the LIGGGHTS implementation of Hertz is more efficient, and then this effect is less visible? I also suspect Ig2_Facet_Sphere, which generates and manipulates matrices every time, even for virtual interactions [1].
This is actually a strange case to test scaling, since the number of spheres cannot exceed ~2500 by definition...
Bruno
[1] https:/
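That ~2500-sphere ceiling can be roughly checked from the container dimensions in the script. A sketch; the packing fractions used (~0.3 for random sequential insertion as in makeCloud-style clouds, ~0.64 for random close packing) are assumed typical values, not figures from this thread:

```python
import math

# Container dimensions from the script, in units of the ball radius rball
R = 22.5        # cylinder radius (rcont)
h_cyl = 17.13   # cylinder height (hcil)
h_cone = 6.37   # cone height (hcone)
r_low = 4.0     # lower cone radius (rconelow)

# Cylinder plus truncated-cone base, volumes in units of rball**3
V_cyl = math.pi * R**2 * h_cyl
V_cone = math.pi * h_cone / 3 * (R**2 + R * r_low + r_low**2)
V = V_cyl + V_cone
V_sphere = 4.0 / 3.0 * math.pi  # unit-radius sphere

# Random insertion saturates near phi ~ 0.3; random close packing near 0.64
print(round(0.30 * V / V_sphere))  # ~2240 spheres: close to the ~2500 ceiling
print(round(0.64 * V / V_sphere))  # ~4790 spheres if densified to RCP
```

So the 2400-2600 range is indeed near the saturation limit of random insertion in this geometry, consistent with the coordination-number argument above.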
#13 (Andrea):
Yes, I know, and what I call transient is the initial time when the particles are still not 2600 in my simulation. If you also print the number of particles, you see that I reach 2600 grains after x iterations. So I want to start the comparison between 300 and 2600 after that time. Also, the burst in the kinetic energy only affects the initial part of the simulation.
It is surely a strange case, but it is what I need for my research. I'm studying what happens in this setup when I increase the packing fraction of the system, i.e. when I increase the number of particles (without changing their size) in the same volume. Maybe there is no reason to obtain linearity for "CPU time vs N" in these conditions, because the number and the complexity of the interactions increase in a non-trivial way. Looking at the source code, I've found that the implementation of Hertz-Mindlin is a little bit different in the two codes, and maybe this would be an element to take into account. I will also study Ig2_Facet_Sphere.
In any case, I'm doing a more systematic analysis and I will share results and raw data with you soon.
Thanks a lot for this interesting discussion!
Andrea
#14 (Bruno Chareyre):
Welcome.
That would be interesting for me to see:
N=.. | YadeTime | LammpsTime
200
400
...
2400
I was checking Ig2_Facet_Sphere and there is certainly a way to avoid some matrix manipulation by escaping sooner when there is no contact. However, I'll not hurry on that one, and it would not change your life by orders of magnitude anyway.
Besides, it is not a surprise if the details of the Hertz models differ, and it would be interesting to know the consequences from a physics point of view too.
Looking forward. :)
Bruno
#15 (Andrea):
Sorry for the late reply, but I was away from the office.
This is my analysis of the total time needed to perform the same simulation (both LAMMPS and YADE write to file and insert particles in almost the same way).
#Nball   TotCPU YADE (s)   TotCPU LAMMPS (s)
200      23.1              15.3
400      42.9              25.1
800      85.4              36.9
1200     144.9             44.8
1600     225.7             60.1
2000     416.3             78.0
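A quick log-log fit of the endpoints of these timings quantifies the difference in scaling (a sketch using just the two extreme points, not a full regression):

```python
import math

# Total CPU time (s) for 200 and 2000 balls, from the tables above
yade = {200: 23.1, 2000: 416.3}
lammps = {200: 15.3, 2000: 78.0}

def scaling_exponent(t, n1=200, n2=2000):
    """Effective exponent b in t ~ N**b between the two endpoints."""
    return math.log(t[n2] / t[n1]) / math.log(n2 / n1)

print(round(scaling_exponent(yade), 2))    # ~1.26: super-linear
print(round(scaling_exponent(lammps), 2))  # ~0.71: sub-linear
```

So in this data YADE scales slightly super-linearly while LAMMPS is distinctly sub-linear, which matches the earlier observation that the gap widens as the container approaches its filling limit.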
I give you the YADE script used. Run it as:
time yade -j 4 -n MyScript.py Nball logfile
I use the time command, which prints the total execution time at the end.
Andrea
-------
from yade import pack,ymport,
import numpy as np
import math
import sys
import random
def printator():
neff=
f1.
'+str(O.realtime)+' '+str(kineticEn
def adder():
if(
sp.makeCloud(
#Useful dimensions
rball=0.002
hcone=6.37*rball
rcont=22.5*rball
rconelow=4*rball
hcil=17.13*rball
Nball=int(
A=0.00025 #amplitude shaker
fr=200 #freq shaker
#Steel
densSteel=8000
ySteel=21e7 #original: e9
poisSteel=0.293
shearModSteel=
#Plexiglass
densPMMA=1190
yPMMA=33e6
poissPMMA=0.37
frictAngle=
#Add material
#plexiglass as lammps
O.materials.
frictionAngle=
#Steel as lammps
O.materials.
frictionAngle=
#rayleigh time
tRay=math.
#container
coneId=
cyliId=
rcont,hcil,
wallMask=
contenitore=
nBodyCont=
track=[]
for k in range(nBodyCont
track.
f1=open(
O.engines=[
	ForceResetter(),
	InsertionSortCollider(...),
	InteractionLoop(
		[Ig2_Sphere_Sphere_ScGeom(), Ig2_Facet_Sphere_ScGeom()],
		[Ip2_FrictMat_FrictMat_MindlinPhys(..., betas=0.4)],
		[Law2_ScGeom_MindlinPhys_Mindlin(...)],
	),
	NewtonIntegrator(...)
]
O.engines = O.engines + [HarmonicMotionEngine(A=(0,0,A), f=(0,0,fr), label='shaker')]
O.engines = O.engines + [PyRunner(command="print O.iter, O.time, O.speed, O.realtime, ...", ...),
                         PyRunner(command="adder()", ...),
                         PyRunner(command="printator()", ...)]
O.timingEnabled
O.dt=0.2*tRay
O.run(500000)
O.wait()
exit()
#16:
This question was expired because it remained in the 'Open' state without activity for the last 15 days.