MPI with Clumps Particles

Asked by Nima Goudarzi

Hi,

I'm trying to use automatic parallelization for a large bed consisting of both spherical and clump particles. I have been able to use MPI for purely spherical particles (stabilizing a bed of 500,000 grains), but MPI seems to have issues with clumps: I receive errors in both interactive and passive modes that prevent splitting and parallel running. Has anyone used MPI with clump particles successfully, or are there specific considerations for doing so, such as manual splitting or MPI customization? I would also highly appreciate it if someone could point me to some examples of YADE MPI with clumps.

Thanks so much,

Question information

Language: English
Status: Expired
For: Yade
Assignee: No assignee

Robert Caulk (rcaulk) said :
#1

Hello Nima,

I am not an expert in MPI, but I advise you to provide more information so that you get the best help possible. You mention "errors in interactive and passive modes." We do not know what that means; you will need to be clearer, i.e. post the errors directly here. You also need to add an MWE [1].

[1] https://www.yade-dem.org/wiki/Howtoask

Thanks,

Robert

Launchpad Janitor (janitor) said :
#2

This question was expired because it remained in the 'Needs information' state without activity for the last 15 days.

Nima Goudarzi (nimagoudarzi58) said :
#3

Hi all,

I have tried hard to run an MPI simulation with grains that are a mixture of clumps and spheres. The grains are imported into the scene rather than generated in the simulation; this is not by choice, but the grain generation procedure is complex and I have to run it in a separate script. My analysis consists of the penetration of an excavator bucket into a medium-dense bed (n=0.44), followed by rotation about a zero point, which excavates the bed. I therefore used CombinedKinematicEngine to handle both motions of the excavator. I have tried many MPI customizations for this simulation with little success, which makes me doubt whether MPI (with its automatic splitting) is appropriate for grains other than perfect spheres. My simulation crashes after a few hundred iterations, and this is what I receive:

malloc(): invalid size (unsorted)
[ubuntu-office:998639] *** Process received signal ***
[ubuntu-office:998639] Signal: Aborted (6)
[ubuntu-office:998639] Signal code: (-6)
[ubuntu-office:998639] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x46210)[0x7fcb6eebf210]
[ubuntu-office:998639] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fcb6eebf18b]
[ubuntu-office:998639] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fcb6ee9e859]
[ubuntu-office:998639] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x903ee)[0x7fcb6ef093ee]
[ubuntu-office:998639] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x9847c)[0x7fcb6ef1147c]
[ubuntu-office:998639] [ 5] /lib/x86_64-linux-gnu/libc.so.6(+0x9b234)[0x7fcb6ef14234]
[ubuntu-office:998639] [ 6] /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x1b9)[0x7fcb6ef16419]
[ubuntu-office:998639] [ 7] /usr/bin/python3.8(PyBytes_FromStringAndSize+0x231)[0x5f7b11]
[ubuntu-office:998639] [ 8] /usr/lib/python3/dist-packages/mpi4py/MPI.cpython-38-x86_64-linux-gnu.so(+0x724d5)[0x7fcb53e6f4d5]
[ubuntu-office:998639] [ 9] /usr/lib/python3/dist-packages/mpi4py/MPI.cpython-38-x86_64-linux-gnu.so(+0x91642)[0x7fcb53e8e642]
[ubuntu-office:998639] [10] /usr/bin/python3.8(PyCFunction_Call+0x59)[0x5f2cc9]
[ubuntu-office:998639] [11] /usr/bin/python3.8(_PyObject_MakeTpCall+0x23f)[0x5f30ff]
[ubuntu-office:998639] [12] /usr/bin/python3.8(_PyEval_EvalFrameDefault+0x6246)[0x5705f6]
[ubuntu-office:998639] [13] /usr/bin/python3.8(_PyFunction_Vectorcall+0x1b6)[0x5f5956]
[ubuntu-office:998639] [14] /usr/bin/python3.8(_PyEval_EvalFrameDefault+0x72f)[0x56aadf]
[ubuntu-office:998639] [15] /usr/bin/python3.8(_PyFunction_Vectorcall+0x1b6)[0x5f5956]
[ubuntu-office:998639] [16] /usr/bin/python3.8(_PyEval_EvalFrameDefault+0x72f)[0x56aadf]
[ubuntu-office:998639] [17] /usr/bin/python3.8(_PyFunction_Vectorcall+0x1b6)[0x5f5956]
[ubuntu-office:998639] [18] /usr/bin/python3.8(_PyEval_EvalFrameDefault+0x57d7)[0x56fb87]
[ubuntu-office:998639] [19] /usr/bin/python3.8(_PyEval_EvalCodeWithName+0x26a)[0x568d9a]
[ubuntu-office:998639] [20] /usr/bin/python3.8(PyEval_EvalCode+0x27)[0x68cdc7]
[ubuntu-office:998639] [21] /usr/bin/python3.8[0x67e161]
[ubuntu-office:998639] [22] /usr/bin/python3.8[0x67e1df]
[ubuntu-office:998639] [23] /usr/bin/python3.8(PyRun_StringFlags+0x7f)[0x67e32f]
[ubuntu-office:998639] [24] /usr/bin/python3.8(PyRun_SimpleStringFlags+0x3f)[0x67e38f]
[ubuntu-office:998639] [25] /home/ngoudarzi/Desktop/My Own YADE/install/lib/x86_64-linux-gnu/yade-2020.01a/libyade.so(_Z11pyRunStringRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x1c)[0x7fcb6d40622c]
[ubuntu-office:998639] [26] /home/ngoudarzi/Desktop/My Own YADE/install/lib/x86_64-linux-gnu/yade-2020.01a/libyade.so(_ZN4yade5Scene18moveToNextTimeStepEv+0x132)[0x7fcb6c397922]
[ubuntu-office:998639] [27] /home/ngoudarzi/Desktop/My Own YADE/install/lib/x86_64-linux-gnu/yade-2020.01a/libyade.so(_ZN4yade14SimulationFlow12singleActionEv+0x78)[0x7fcb6c3b5d58]
[ubuntu-office:998639] [28] /home/ngoudarzi/Desktop/My Own YADE/install/lib/x86_64-linux-gnu/yade-2020.01a/libyade.so(_ZN4yade12ThreadWorker16callSingleActionEv+0x14)[0x7fcb6c3d60f4]
[ubuntu-office:998639] [29] /home/ngoudarzi/Desktop/My Own YADE/install/lib/x86_64-linux-gnu/yade-2020.01a/libyade.so(_ZN4yade12ThreadRunner4callEv+0x3f)[0x7fcb6c3d3fdf]
[ubuntu-office:998639] *** End of error message ***
--------------------------------------------------------------------------
Child job 2 terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------

I think I am missing some important adjustments for the simulation, but first I need to know whether MPI works with imported clumps or has only been designed for spherical particles.

This is my latest script for the simulation. For now, my goal is to find out whether there is a significant issue in the structure of the input; therefore, all input (STL, clumps) and output file names are left as EMPTY:

from yade import pack
from yade import plot
from yade import ymport
from yade import export
import os
from yade import mpy as mp
############################################
### INPUT PARAMETERS ###
############################################
NSTEPS=1000 # set it >0 to run time iterations, otherwise only initialization
numThreads = 9 # number of threads to be spawned, (in interactive mode).
velocity=10 #m/s THAT SHOULD BE 0.01 m/s
angularVelocity=10 #RAD/S
gravityAcc=-9.81
atRestFricDegree = 40 # INITIAL CONTACT FRICTION FOR SAMPLE PREPARATION
normalDamp=0.3 # NUMERICAL DAMPING
shearDamp=0.3
youngSoil=0.7e8# CONTACT STIFFNESS FOR SOIL
youngContainer=210e9 # CONTACT STIFFNESS FOR CONTAINER
poissonSoil=0.3 # POISSON'S RATIO FOR SOIL
poissionContainer=0.25 # POISSON'S RATIO FOR CONTAINER
densSoil=2650 # DENSITY FOR SOIL
densContainer=7850 # DENSITY FOR CONTAINER
numDamp=0.4
InitialPeneterationofExcavator=0.7e-3

O.materials.append(FrictMat(young=youngContainer,poisson=poissionContainer,frictionAngle=radians(0),density=densContainer,label='excavator'))
O.materials.append(FrictMat(young=youngSoil,poisson=poissonSoil,frictionAngle=radians(atRestFricDegree),density=densSoil,label='Soil'))

ymport.textClumps("EMPTY",material='Soil')

mn,mx=aabbExtrema()
walls=aabbWalls([mn,mx],thickness=0.0001, oversizeFactor=1.0,material='excavator')
#for w in walls: w.shape.wire=False
O.bodies.append(walls[:3]+walls[4:]) #don't insert top wall

Excavator = O.bodies.append(ymport.stl('EMPTY',material='excavator',wire=True,color=Vector3(255,178,102)))
ExcavatorID = [b for b in O.bodies if isinstance(b.shape,Facet)] # list of facets in simulation

O.engines = [VTKRecorder(fileName='EMPTY' ,parallelMode=True, recorders=['all'],iterPeriod=50000),
 ForceResetter(),
 InsertionSortCollider([Bo1_Sphere_Aabb(),Bo1_Box_Aabb(),Bo1_Facet_Aabb()], label="collider"),
 InteractionLoop(
  [Ig2_Sphere_Sphere_ScGeom(),Ig2_Box_Sphere_ScGeom(),Ig2_Facet_Sphere_ScGeom()],
  [Ip2_FrictMat_FrictMat_MindlinPhys(betan=normalDamp,betas=shearDamp,label='ContactModel')],
  [Law2_ScGeom_MindlinPhys_Mindlin(label='Mindlin')]
 ),

 GlobalStiffnessTimeStepper(active=1,timeStepUpdateInterval=100,parallelMode=True, timestepSafetyCoefficient=0.35),
 PyRunner(iterPeriod=1, command="penetration()" ,label='checker'),
 CombinedKinematicEngine(ids=Excavator,label='combEngine') + TranslationEngine(translationAxis=(0,-1,0),velocity=velocity) + RotationEngine(rotationAxis=(0,0,1), angularVelocity=0, rotateAroundZero=True, zeroPoint=(0.005851,0.009186,0.003200)),
 NewtonIntegrator(damping=numDamp,gravity=(0,gravityAcc,0)),
 PyRunner(iterPeriod=100,command='history()',label='recorder'),
]
#collider.verletDist = 0.0007
# collider.targetInterv=200
#O.dt=0.1*PWaveTimeStep()

# get TranslationEngine and RotationEngine from CombinedKinematicEngine
transEngine, rotEngine = combEngine.comb[0], combEngine.comb[1]
initialZeroPointVertical=rotEngine.zeroPoint[1]
print ("Zero Point Initial Vertical Position:",initialZeroPointVertical)

def penetration():
 transEngine.velocity = velocity
 print ("Zero Point Vertical Position:",rotEngine.zeroPoint[1])
 print ("Iteration:",O.iter)

 #rotEngine.angularVelocity = angularVelocity
 rotEngine.zeroPoint += Vector3(0,-1,0)*velocity*O.dt
 if rotEngine.zeroPoint[1]<=(initialZeroPointVertical-InitialPeneterationofExcavator):
  checker.command='updateKinematicEngines()'

def updateKinematicEngines():
 transEngine.translationAxis=(0,0,1)
 transEngine.velocity = 0
 print ("Zero Point Vertical Position:",rotEngine.zeroPoint[1])
 rotEngine.angularVelocity = angularVelocity
 #rotEngine.zeroPoint += Vector3(0,0,1)*velocity*O.dt

def history():
  global Fx,Fy,Fz
  Fx=0
  Fy=0
  Fz=0
  for b in ExcavatorID:
    Fx+=O.forces.f(b.id,sync=True)[0]
    Fy+=O.forces.f(b.id,sync=True)[1]
    Fz+=O.forces.f(b.id,sync=True)[2]
  yade.plot.addData({'i':O.iter,'Fx':Fx,'Fy':Fy,'Fz':Fz,})
  ## In that case we can still save the data to a text file at the end of the simulation, with:
  plot.saveDataTxt('EMPTY')

## declare what is to plot. "None" is for separating y and y2 axis
plot.plots={'i':('Fx','Fy','Fz')}
plot.plot()

######## RUN ##########
#customize mpy
mp.ACCUMULATE_FORCES=True
mp.ERASE_REMOTE_MASTER = True #keep remote bodies in master?
mp.DOMAIN_DECOMPOSITION= True #automatic splitting/domain decomposition
mp.OPTIMIZE_COM=True
#mp.mpirun(NSTEPS) #passive mode run
mp.MERGE_W_INTERACTIONS = False
mp.mpirun(NSTEPS,numThreads,withMerge=True) # interactive run, numThreads is the number of workers to be initialized, see below for withMerge explanation.
mp.mergeScene() #merge scene after run.
if mp.rank == 0: O.save('mergedScene.yade')

# #demonstrate getting stuff from workers, here we get kinetic energy from worker subdomains, notice that the master (mp.rank = 0), uses the sendCommand to tell workers to compute kineticEnergy.
# if mp.rank==0:
# print("kinetic energy from workers: "+str(mp.sendCommand([1,2],"kineticEnergy()",True)))

Thanks so much

Robert Caulk (rcaulk) said :
#4

Hey Nima,

>>need to know first that MPI works with imported clumps or it has only been designed for spherical particles.

It does not. It was not designed for clumps.

I chatted with Bruno about this, and it seems there need to be some changes to the source code to accommodate clumps in MPI. The yade mpi module does not know how to handle a clump that crosses two subdomains.

Further technical direction will need to come from Bruno directly, but at least you now know that you need to investigate the subdomain source code to see how it handles bodies and interactions, and either make it accommodate clump members or somehow ensure that all members of a clump are part of the same subdomain.

Cheers,

Robert

Nima Goudarzi (nimagoudarzi58) said :
#5

Hi Robert.

Many thanks for getting back to me. Indeed, this was my first guess for the unsuccessful simulation, but something keeps me from attributing everything to an inability to split the domain soundly when clumps are present. YADE splits the domain according to the number of threads I request, which means the subdomain boundaries either go around whole clumps or pass through them. In the latter case (if it is possible), the clumps are split at the boundaries, which could be the origin of the error I receive, but I cannot verify whether this is the case. The other odd thing is that the simulation runs for a few thousand iterations and then terminates suddenly. I would assume that if YADE could not handle clumps, as you suggested, the simulation would not even start.
Regarding this, I can also consider the possibility that I am missing some MPI customization for ranks and subdomains, since I am using the automatic splitting scheme from one of the MPI examples. Any thoughts on this possibility are welcome. To a lesser extent, I also have doubts about the compatibility of MPI with the engines I have used for translating and rotating the imported .stl file.
FYI, I sometimes get a similar error with imported, purely spherical particles in a gravity collapse test.
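
One way the "clumps are split at the boundaries" hypothesis could be checked is to list, after the decomposition, every clump whose members carry different subdomain numbers. The following is only a minimal sketch in plain Python, not part of the Yade API: the function name and the tuple layout are hypothetical, and it assumes that each body exposes a clump id (-1 for a standalone sphere) and a subdomain number.

```python
from collections import defaultdict

def find_split_clumps(bodies):
    """Return {clump_id: sorted subdomains} for clumps that span more than one subdomain.

    bodies: iterable of (body_id, clump_id, subdomain) tuples (hypothetical layout).
            clump_id is assumed to be -1 for standalone spheres, which are ignored.
    """
    seen = defaultdict(set)
    for _body_id, clump_id, subdomain in bodies:
        if clump_id >= 0:
            # Collect every subdomain in which this clump has a member.
            seen[clump_id].add(subdomain)
    # Keep only the clumps whose members do not all share one subdomain.
    return {cid: sorted(subs) for cid, subs in seen.items() if len(subs) > 1}
```

In a Yade script the tuples would presumably be built from `b.id`, `b.clumpId` and `b.subdomain` for each body in `O.bodies`. A non-empty result would support the idea that the automatic splitting cuts through clumps; an empty one would point elsewhere.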

Looking forward to hearing back from Bruno and taking any required actions to resolve the issue.

Much obliged

Nima

Bruno Chareyre (bruno-chareyre) said :
#6

Hi,
I can imagine why it fails.
Can you confirm that the problem also occurs without the motion engines?
Possible directions for solving the problem:
1- Assign the subdomains yourself, making sure all members of one clump have the same domain assigned (though I suspect even this may not be enough)
2- (more a workaround) don't use clumps at all; instead make cohesive aggregates. You lose the enhanced performance of computing truly rigid clumps, but if you have hundreds of available cores it could still be an improvement over OpenMP.
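
Option 1 could be sketched roughly as follows. This is an illustration in plain Python under stated assumptions, not a tested Yade MPI recipe: the function name, the tuple layout and the slab-based splitting along one axis are all hypothetical choices. The only point it demonstrates is that the subdomain is decided once per clump (from the mean position of its bodies) and then copied to every member, so a clump can never straddle two subdomains at assignment time.

```python
from collections import defaultdict

def assign_subdomains(bodies, n_workers, axis=0):
    """Return {body_id: subdomain} with all bodies of a clump in the same subdomain.

    bodies: iterable of (body_id, clump_id, position) tuples (hypothetical layout).
            clump_id is assumed to be -1 for a standalone sphere; the clump body
            itself can be included with clump_id equal to its own id.
    n_workers: number of worker subdomains, numbered 1..n_workers (0 is left for the master).
    axis: coordinate along which the bed is cut into equal slabs.
    """
    bodies = list(bodies)
    if not bodies or n_workers < 1:
        return {}
    # Each clump forms one group; each standalone sphere is a group of its own.
    groups = defaultdict(list)
    for body_id, clump_id, pos in bodies:
        key = ("clump", clump_id) if clump_id >= 0 else ("single", body_id)
        groups[key].append((body_id, pos[axis]))
    lo = min(pos[axis] for _, _, pos in bodies)
    hi = max(pos[axis] for _, _, pos in bodies)
    width = (hi - lo) / n_workers or 1.0  # fall back to 1.0 if all bodies share one coordinate
    result = {}
    for members in groups.values():
        # Decide the slab once per group, from the mean coordinate of its bodies.
        centre = sum(x for _, x in members) / len(members)
        slab = min(int((centre - lo) / width), n_workers - 1)
        for body_id, _ in members:
            result[body_id] = slab + 1
    return result
```

In a Yade script the tuples would presumably come from `b.id`, `b.clumpId` and `b.state.pos` for each grain in `O.bodies`, and the result would be written to `b.subdomain` before `mp.mpirun` is called, with `mp.DOMAIN_DECOMPOSITION` set to False so that the automatic splitting does not overwrite it. The sketch covers only the initial assignment of the grains; walls and facets need separate handling, and it does not address what happens if bodies are reassigned between subdomains later in the run.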

Cheers

Bruno

Nima Goudarzi (nimagoudarzi58) said :
#7

Hi Bruno,

Thanks so much. I think both options are tedious, but I am interested in trying option 1 first. Are there any examples showing how to do this for clumps? Do I need to predefine the subdomains manually (I mean grouping clumps) before importing, or can I do so in the main analysis script? As for option 2, I unfortunately don't have access to hundreds of cores.
I would highly appreciate some guidance on option 1.

Much Obliged

Nima

Launchpad Janitor (janitor) said :
#8

This question was expired because it remained in the 'Open' state without activity for the last 15 days.