Any method about accelerating flowengine simulation

Asked by Ziyu Wang

Hi,
I tried the failure process of triaxial compression under fluid-solid coupling. I previously carried out triaxial compression without flowengine, and the calculation speed was about 80 step/s(Acceptable..).

However,when I added flowengine for coupling, I found that the calculation speed was too slow (20/s). Since the previous triaxial compression simulation was already slow, I think I can't have more particles (8K at present), so it is not applicable to the GPU acceleration recommended on the yade website.

I have attached my code below. Is there anyway I can make it run faster?
#####################
from yade import pack, ymport, plot, utils, export, timing
young=66.2e9
poisson=0.522
name='cylinder'
preStress=-30e6
strainRate = -0.1
OUT=str(preStress)+'_JCFPM_triax'

nw=24
nh=15
rParticle=0.000731723
bcCoeff = 5
width = 0.025
height = 0.05
mnx=-0.011871843
mny=-0.0118675
mnz=0.00072483840000000003
mxx=0.011855783999999999
mxy=0.011865682000000001
mxz=0.049744924000000003
Sy=2e6

O.materials.append(JCFpmMat(type=1,density=2640,young=young,poisson=poisson,tensileStrength=8e6,cohesion=40e6,frictionAngle=radians(25),label='sphere'))
O.materials.append(JCFpmMat(type=1,label='wall1',young=young,poisson=poisson,frictionAngle=radians(25)))
O.materials.append(JCFpmMat(type=1,frictionAngle=0,density=0,label='wall2'))

mn,mx=Vector3(mnx,mny,mnz),Vector3(mxx,mxy,mxz)
walls=aabbWalls([mn,mx],thickness=0,material='wall2')
wallIds=O.bodies.append(walls)

bottom,top=Vector3(0,0,0),Vector3(0,0,0.05)
radius=0.0125
sp=pack.SpherePack()
pred=pack.inCylinder(bottom,top,radius)
sp=pack.randomDensePack(pred,radius=0.0005,rRelFuzz=0,returnSpherePack=True,memoizeDb='/tmp/triax.sqlite')
spheres=sp.toSimulation(material='sphere',color=(0.9,0.8,0.6))

bot = [O.bodies[s] for s in spheres if O.bodies[s].state.pos[2]<rParticle*bcCoeff]
top = [O.bodies[s] for s in spheres if O.bodies[s].state.pos[2]>height-rParticle*bcCoeff]
vel = strainRate*(height-rParticle*2*bcCoeff)
for s in bot:
 s.shape.color = (1,0,0)
 s.state.blockedDOFs = 'xyzXYZ'
 s.state.vel = (0,0,-vel)
for s in top:
 s.shape.color = Vector3(0,1,0)
 s.state.blockedDOFs = 'xyzXYZ'
 s.state.vel = (0,0,vel)

facets = []
rCyl2 = .5*width / cos(pi/float(nw))
for r in range(nw):
 for h in range(nh):
  v1 = Vector3( rCyl2*cos(2*pi*(r+0)/float(nw)), rCyl2*sin(2*pi*(r+0)/float(nw)), height*(h+0)/float(nh) )
  v2 = Vector3( rCyl2*cos(2*pi*(r+1)/float(nw)), rCyl2*sin(2*pi*(r+1)/float(nw)), height*(h+0)/float(nh) )
  v3 = Vector3( rCyl2*cos(2*pi*(r+1)/float(nw)), rCyl2*sin(2*pi*(r+1)/float(nw)), height*(h+1)/float(nh) )
  v4 = Vector3( rCyl2*cos(2*pi*(r+0)/float(nw)), rCyl2*sin(2*pi*(r+0)/float(nw)), height*(h+1)/float(nh) )
  f1 = facet((v1,v2,v3),color=(0,0,1),material='wall1')
  f2 = facet((v1,v3,v4),color=(0,0,1),material='wall1')
  facets.extend((f1,f2))

O.bodies.append(facets)
mass = O.bodies[0].state.mass
for f in facets:
 f.state.mass = mass
 f.state.blockedDOFs = 'XYZz'

def addForces():
 for f in facets:
  n = f.shape.normal
  a = f.shape.area
  O.forces.addF(f.id,preStress*a*n)

def stopIfDamaged(maxEps=0.001):
 extremum = max(abs(s) for s in plot.data['s'])
 s = abs(plot.data['s'][-1])
 e = abs(plot.data['e'][-1])
 if O.iter < 1000 or e < maxEps:
  return
 if abs(s)/abs(extremum) < 0.5 :
  print('Simulation finished')
  print('Max stress and strain:',extremum,e)
  O.pause()

O.dt=.5*utils.PWaveTimeStep()
newton=NewtonIntegrator(damping=0.4)

O.engines=[
 ForceResetter(),
 InsertionSortCollider([Bo1_Sphere_Aabb(aabbEnlargeFactor=1.5,label='is2aabb'),Bo1_Facet_Aabb()]),
 InteractionLoop(
  [Ig2_Sphere_Sphere_ScGeom(interactionDetectionFactor=1.5,label='ss2sc'),Ig2_Facet_Sphere_ScGeom()],
  [Ip2_JCFpmMat_JCFpmMat_JCFpmPhys()],
  [Law2_ScGeom_JCFpmPhys_JointedCohesiveFrictionalPM(recordCracks=True,Key=OUT+'_Crack',label='interactionLaw')]
 ),
 PyRunner(iterPeriod=1,command="addForces()"),
 FlowEngine(dead=1,label="flow",multithread=False),
 newton,
 PyRunner(command='plotAddData()',iterPeriod=10),
 PyRunner(iterPeriod=20,command='stopIfDamaged()'),
]

def plotAddData():
 f1 = sum(O.forces.f(b.id)[2] for b in top)
 f2 = sum(O.forces.f(b.id)[2] for b in bot)
 f = .5*(f2-f1)
 s = f/(pi*.25*width*width)
 e = (top[0].state.displ()[2] - bot[0].state.displ()[2]) / (height-rParticle*2*bcCoeff)
 plot.addData(
  i = O.iter,
  s = -s,
  e = -e,
 )
 plot.saveDataTxt(OUT)

O.step()
ss2sc.interactionDetectionFactor=-1
is2aabb.aabbEnlargeFactor=-1

flow.debug=False
flow.permeabilityMap = False
flow.fluidBulkModulus=2.2e9
flow.useSolver=4
flow.permeabilityFactor=1
flow.viscosity= 0.001
flow.decoupleForces = False
flow.boundaryUseMaxMin=[0,0,0,0,0,0]
flow.bndCondIsPressure=[1,1,1,1,1,1]
flow.bndCondValue=[0,0,0,0,0,Sy]
flow.bndCondIsTemperature=[0,0,0,0,0,0]
flow.thermalEngine=False
flow.thermalBndCondValue=[0,0,0,0,0,0]
flow.dead=0
flow.emulateAction()
#################blockcells################
numPoints = 100
xs = np.linspace(1.05*mnx,1.05*mxx,numPoints) #np.linspace is to divide the distance between 2 points
ys = np.linspace(1.05*mny,1.05*mxy,numPoints) #if the factor for mx change to less than 1, code will not work properly.
zs = np.linspace(0.95*mnz,1.05*mxz,numPoints)

cellsHit = [] #create array
cylcenterx=(mxx+mnx)/2
cylcentery=(mxy+mny)/2
for x,y,z in itertools.product(xs, ys, zs):
 cellId = flow.getCell(x,y,z) #getCell return cell id for a point position xyz
 if cellId in cellsHit: continue #this prevents counting a cell multiple times
 if np.sqrt((x-cylcenterx)**2+(y-cylcentery)**2) > width:
  cellsHit.append(cellId)

for i in cellsHit:
 flow.blockCell(i,blockPressure=True)
 flow.setCellPressure(i,0)
O.run(1,1)

flow.meshUpdateInterval=-1 #these two lines to prevent remeshing after 1000 run which unblock all cells in cellsHit
flow.defTolerance=-1

plot.plots = { 'e':('s',), 'elateral':('s'),}
plot.plot()
O.run()
###############################

Question information

Language:
English Edit question
Status:
Solved
For:
Yade Edit question
Assignee:
No assignee Edit question
Solved by:
Robert Caulk
Solved:
Last query:
Last reply:

This question was reopened

Revision history for this message
Ziyu Wang (ziyuwang1) said :
#1

Some information in printAllVersions() and my computer hardware configuration:
###
Yade version : 20220606-6616~ab84d1d~bionic1
Yade features : LOGGER USEFUL_ERRORS VTK OPENMP GTS QT5 CGAL PFVFLOW PFVFLOW LINSOLV MPI TWOPHASEFLOW LS_DEM FEMLIKE GL2PS LBMFLOW THERMAL PARTIALSAT POTENTIAL_PARTICLES POTENTIAL_BLOCKS
Yade config dir: ~/.config/yadedaily
Yade precision : 53 bits, 15 decimal places, without mpmath, PrecisionDouble
```

Libraries used :

| library | cmake | C++ |
| ------------- | -------------------- | ------------------- |
| boost | 106501 | 1.65.1 |
| cgal | | 4.11 |
| clp | 1.16.11 | 1.16.11 |
| cmake | 3.10.2 | |
| coinutils | 2.10.14 | 2.10.14 |
| compiler | /usr/bin/c++ 7.5.0 | gcc 7.5.0 |
| eigen | 3.3.4 | 3.3.4 |
| freeglut | 2.8.1 | |
| gl | | 20190911 |
| ipython | 5.5.0 | |
| metis | | 5.1.0 |
| mpi | 3.1 | ompi:2.1.1 |
| mpi4py | 2.0.0 | |
| mpmath | 1.0.0 | |
| openblas | | OpenBLAS 0.2.20 |
| python | 3.6.9 | 3.6.9 |
| qglviewer | | 2.6.3 |
| qt | | 5.9.5 |
| sphinx | 1.6.7-final-0 | |
| sqlite | | 3.22.0 |
| suitesparse | 5.1.2 | 5.1.2 |
| vtk | 6.3.0 | 6.3.0 |

```
Linux version : Ubuntu 18.04.6 LTS
Architecture : amd64
Little endian : True
###

About hardware:
Memory 15.6GiB
processor AMD® Ryzen 7 4800h with radeon graphics × 16
Graphics card NVIDIA GeForce RTX 2060/PCIe/SSE2
disk 101.7 GB

Revision history for this message
Robert Caulk (rcaulk) said :
#2

Hello,

We happened to write a paper exactly on this topic:

https://www.sciencedirect.com/science/article/abs/pii/S0010465519303340

Cheers,

Robert

Revision history for this message
Ziyu Wang (ziyuwang1) said :
#3

Thanks Robert Caulk, that solved my question.

Revision history for this message
Ziyu Wang (ziyuwang1) said :
#4

Hi,Robert.
Can you help me see if my computer configuration can achieve GPU acceleration?

###
Memory 15.6GiB
processor AMD® Ryzen 7 4800h with radeon graphics × 16
Graphics card NVIDIA GeForce RTX 2060/PCIe/SSE2
disk 101.7 GB

Revision history for this message
Best Robert Caulk (rcaulk) said :
#5

Hello,

Flow engine depends on double precision so it depends on the double precision gflops of your GPU. For this GPU you can find all the metrics at:

https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_20_series_2

144 GFLOPS

Which is quite low, likely lower than your CPU.

Cheers,

robert

Revision history for this message
Ziyu Wang (ziyuwang1) said :
#6

Thanks Robert Caulk, that solved my question.