Batch run and run alone get different results

Asked by Ziyu Wang

Hello,
I'm working on parameter calibration, so I want to use batch processing to save time. However, I found that the same parameters give very different results in a separate run and in a batch run, and I am confused.

My yadedaily version is 20220218-6333~d331682~bionic1 and my Ubuntu is 18.04.
I run the batch with: yadedaily-batch -j 12 --job-threads 2 young.txt JCFpm_dense_triaxial.py
and the separate run with: yadedaily -j 12 JCFpm_dense_triaxial.py
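
A rough sketch of how the two command lines differ in thread usage. It assumes that yade-batch treats -j as the total thread budget and --job-threads as the thread count per job; the helper name concurrent_jobs is hypothetical and only illustrates the arithmetic.

```python
def concurrent_jobs(total_threads: int, threads_per_job: int) -> int:
    """Number of jobs that fit into the thread budget at once (assumed model)."""
    if total_threads < 1 or threads_per_job < 1:
        raise ValueError("thread counts must be positive")
    return total_threads // threads_per_job

# yadedaily-batch -j 12 --job-threads 2: every job gets 2 threads
batch_jobs = concurrent_jobs(12, 2)
threads_per_batch_job = 2

# yadedaily -j 12: the single simulation gets all 12 threads
threads_single_run = 12

print(batch_jobs, threads_per_batch_job, threads_single_run)
```

Under this assumption a batch job runs with 2 threads while the separate run uses 12, so the two modes are not configured identically even for the same young.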

Thanks for help!
Following is my script:

----------------------------------------------------
from yade import pack, ymport, plot, utils, export, timing
import numpy as np
import sys

readParamsFromTable(young=3000e6)
from yade.params import table
Wy=-30e6
rate=-0.1
damp=0.4
stabilityThreshold=0.001
key='_triax_base_'
young=table.young
name='JCFPM_triax'
compFricDegree=30
poisson=0.4
OUT=str(young)+'_JCFPM_triax'

mn,mx=Vector3(0,0,0),Vector3(0.05,0.05,0.05)
O.materials.append(JCFpmMat(type=1,density=2640,young=young,poisson=poisson,tensileStrength=160e6,cohesion=1600e6,frictionAngle=radians(10),label='sphere'))
O.materials.append(JCFpmMat(type=1,frictionAngle=0,density=0,label='wall'))

walls=aabbWalls([mn,mx],thickness=0,material='wall')
wallIds=O.bodies.append(walls)

O.bodies.append(ymport.text('packing-JCFPM.spheres',scale=1,shift=Vector3(0,0,0),material='sphere',color=(0.6,0.5,0.15)))
#(packing here is generated by randomDensePack with 5000 particles)

triax=TriaxialStressController(
 maxMultiplier=1.+4e8/young,
 finalMaxMultiplier=1.+4e7/young,
 thickness = 0,
 stressMask = 7,
 internalCompaction=True,
)
newton=NewtonIntegrator(damping=damp)

def recorder():
 yade.plot.addData(
 i=O.iter,
 e11=-triax.strain[0],e22=-triax.strain[1],e33=-triax.strain[2],
 s11=-triax.stress(triax.wall_right_id)[0],
 s22=-triax.stress(triax.wall_top_id)[1],
 s33=-triax.stress(triax.wall_front_id)[2],
 numberTc=interactionLaw.nbTensCracks,
 numberSc=interactionLaw.nbShearCracks,
 unb=unbalancedForce()
)
 plot.saveDataTxt(OUT)

def stop_condition():
 extremum=max(abs(s) for s in plot.data['s33'])
 s=abs(plot.data['s33'][-1])
 e=abs(plot.data['e33'][-1])
 if e < 0.001 :
  return
 if abs(s)/abs(extremum) < 0.9:
  O.pause()
  print('Max stress and strain:',extremum,e)
  presentcohesive_count = 0
  for i in O.interactions:
   if hasattr(i.phys, 'isCohesive'):
    if i.phys.isCohesive == True:
     presentcohesive_count+=1
  print('the number of cohesive bond now is:',presentcohesive_count)

O.engines=[
 ForceResetter(),
 InsertionSortCollider([Bo1_Sphere_Aabb(aabbEnlargeFactor=1.15,label='is2aabb'),Bo1_Box_Aabb()]),
 InteractionLoop(
  [Ig2_Sphere_Sphere_ScGeom(interactionDetectionFactor=1.15,label='ss2sc'),Ig2_Box_Sphere_ScGeom()],
  [Ip2_JCFpmMat_JCFpmMat_JCFpmPhys()],
  [Law2_ScGeom_JCFpmPhys_JointedCohesiveFrictionalPM(recordCracks=True,Key=OUT+'_Crack',label='interactionLaw')]
 ),
 GlobalStiffnessTimeStepper(active=1,timeStepUpdateInterval=10,timestepSafetyCoefficient=0.8),
 triax,
 TriaxialStateRecorder(iterPeriod=100,file='WallStresses'+key),
 PyRunner(iterPeriod=int(1000),initRun=True,command='recorder()',label='data',dead=0),
 PyRunner(iterPeriod=1000,command='stop_condition()',dead=0),
 VTKRecorder(iterPeriod=500,initRun=True,fileName='triax/JFFPM-',recorders=['spheres','jcfpm','cracks'],Key=OUT+'_Crack',label='vtk',dead=1),
 newton,
]

O.step()
ss2sc.interactionDetectionFactor=-1
is2aabb.aabbEnlargeFactor=-1
cohesiveCount = 0
for i in O.interactions:
 if hasattr(i.phys, 'isCohesive'):
  if i.phys.isCohesive == True:
   cohesiveCount+=1
print('the origin total number of cohesive bond is:',cohesiveCount)

triax.goal1=triax.goal2=triax.goal3=Wy
while 1:
 #global Wy
 O.run(500,1)
 unb=unbalancedForce()
 print('unbalanced force:',unb,'mean stress:',triax.meanStress)
 if unb<stabilityThreshold and abs(Wy-triax.meanStress)/abs(Wy)<0.001:
  break

cohesiveCount = 0
for i in O.interactions:
 if hasattr(i.phys, 'isCohesive'):
  if i.phys.isCohesive == True:
   cohesiveCount+=1
print('the first total number of cohesive bond is:',cohesiveCount)

triax.internalCompaction=False
triax.stressMask=3
triax.goal1=Wy
triax.goal2=Wy
triax.goal3=rate

plot.plots={'e33':('s33',None,'unb'),'i':('numberTc','numberSc',None,'s33')}
plot.plot()
O.run()
waitIfBatch()
-------------------------------------------------------------------

Question information

Language: English
Status: Solved
For: Yade
Assignee: No assignee
Solved by: Jan Stránský

Robert Caulk (rcaulk) said :
#1

Hello,

>>I found that the same parameters get very different results

Which parameters? What does "very different" mean?

What is your young.txt file?

Cheers,

Robert

Ziyu Wang (ziyuwang1) said :
#2

Hello,
>which parameters?
As you can see, I want to calibrate Young's modulus, so Young's modulus is the only value that changes.
However, the key problem is that even when the same Young's modulus is used, the results still differ.

>what does "very different" mean?
In the script (in stop_condition()), I stop the simulation when the stress drops below 90% of the peak stress.
However, when I run it with plain yadedaily, it stops when strain[2] is 0.0331 and the peak strength is about 120 MPa; when I run it in batch mode, it stops when strain[2] is 0.161 and the peak strength is about 161 MPa.
The parameters used for the above two results (stress and strain values) are exactly the same; only the run mode is different.
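
The stop rule described above can be sketched outside Yade to check where it should fire on a recorded history. This is a minimal sketch: the function name stop_index and the sample data are made up for illustration, and the thresholds 0.001 and 0.9 are taken from stop_condition().

```python
def stop_index(strain, stress, min_strain=0.001, drop_ratio=0.9):
    """Return the first sample index at which the stop rule fires, or None.

    Samples with |strain| below min_strain are ignored; after that the rule
    fires once |stress| falls below drop_ratio times the largest |stress|
    recorded so far.
    """
    peak = 0.0
    for k, (e, s) in enumerate(zip(strain, stress)):
        peak = max(peak, abs(s))
        if abs(e) < min_strain:
            continue
        if peak > 0 and abs(s) / peak < drop_ratio:
            return k
    return None

# Made-up history for illustration only (not data from this thread)
strain = [0.0, 0.0005, 0.002, 0.004, 0.006, 0.008]
stress = [0.0, 20e6, 80e6, 120e6, 115e6, 100e6]
print(stop_index(strain, stress))
```

Applying the same function to the saved data of both runs would show whether the rule fired at the expected sample in each case.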

>What is your young.txt file?
young.txt is a text file. Its contents are as follows:
-----------
young
3000e6
4000e6
5000e6
6000e6
7000e6
8000e6
----------
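
A simplified sketch of how such a table maps to jobs: one header row, then one row per job. The parser read_table is a hypothetical stand-in and assumes every value is a plain float; Yade's own table handling is more general. The output prefix is built the same way as OUT in the script.

```python
TABLE = """young
3000e6
4000e6
5000e6
6000e6
7000e6
8000e6
"""

def read_table(text):
    """Parse a whitespace-separated table: header row, then one row per job."""
    rows = [
        line.split()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith('#')
    ]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, (float(v) for v in row))) for row in data]

jobs = read_table(TABLE)

# Same construction as OUT=str(young)+'_JCFPM_triax' in the script
prefixes = [str(job['young']) + '_JCFPM_triax' for job in jobs]
print(len(jobs), prefixes[0])
```

In a separate (non-batch) run there is no table, so the script falls back to the default given in readParamsFromTable(young=3000e6), which matches only the first row here.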

Thanks for help!

Robert Caulk (rcaulk) said :
#3

>However, when I run it with plain yadedaily, it stops when strain[2] is 0.0331 and the peak strength is about 120 MPa; when I run it in batch mode, it stops when strain[2] is 0.161 and the peak strength is about 161 MPa.

How do these numbers change as you re-run them multiple times with exactly the same parameters?

Ziyu Wang (ziyuwang1) said :
#4

Hello Robert,
I have run the same parameters twice in batch mode and twice as a separate run.
In batch mode, both results are almost the same: strain[2] is about 0.16 and the peak strength is about 161 MPa.
When run with plain yadedaily, the two results are also almost the same: strain[2] is about 0.03 and the peak strength is about 120 MPa.
So it seems that the difference is not accidental?
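
To put numbers on "almost the same" versus "very different", a small sketch of a relative-difference check. The helper rel_diff is hypothetical; the values are the approximate ones reported in this post.

```python
def rel_diff(a, b):
    """Relative difference of two values, scaled by the larger magnitude."""
    scale = max(abs(a), abs(b))
    return 0.0 if scale == 0 else abs(a - b) / scale

# Approximate values from this post: (strain[2] at stop, peak strength in Pa)
separate = (0.03, 120e6)
batch = (0.16, 161e6)

for name, a, b in zip(('strain[2]', 'peak strength'), separate, batch):
    print(f"{name}: relative difference {rel_diff(a, b):.1%}")
```

Repeated runs within one mode should give values close to zero here, while the comparison between modes gives differences far too large to be noise.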

Jan Stránský (honzik) said :
#5

Try without any -j or --job-threads (i.e. using one core in both cases).

> I have run the same parameters twice using batch mode

Just to be sure, do you use the same young parameter for both yadedaily and yadedaily-batch?
This is not stated; only the content of young.txt is mentioned, and it contains various values.

Cheers
Jan

Ziyu Wang (ziyuwang1) said :
#6

Hi Jan
>Try without any -j or --job-threads (i.e. using one core for both cases)
As you suggested, I have run yadedaily-batch twice and plain yadedaily twice using one core (with the same parameters). However, the results are consistent with the earlier ones, and there is still a large gap between the two modes.

>just to be sure, do you use same young parameter for both yadedaily and yadedaily-batch?
Yes, I use the same young (3000e6) for both, and all other parameters are the same. The results (strain and peak strength) I mentioned were also obtained from simulations with the same parameters.

Thanks for help.

Jan Stránský (honzik) said (best answer) :
#7

> waitIfBatch()

This is the only reason I can see in the script for the difference between the batch and the normal run.
How do you run/stop the simulation in non-batch run?
What is the result if you use O.wait() instead?
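
To illustrate why the way a script waits matters: O.run() returns immediately while the simulation continues in the background, and O.wait() blocks until the simulation has been paused. The sketch below mimics that behaviour with a plain Python thread. ToySimulation is a hypothetical stand-in, not Yade's implementation.

```python
import threading

class ToySimulation:
    """Hypothetical stand-in for a background-running simulation (not Yade code)."""

    def __init__(self):
        self.iter = 0
        self._paused = threading.Event()
        self._paused.set()  # nothing is running yet
        self._thread = None

    def run(self, stop_at):
        """Start stepping in a background thread and return immediately."""
        self._paused.clear()
        self._thread = threading.Thread(
            target=self._loop, args=(stop_at,), daemon=True
        )
        self._thread.start()

    def _loop(self, stop_at):
        while not self._paused.is_set():
            self.iter += 1
            if self.iter >= stop_at:  # plays the role of stop_condition()
                self.pause()

    def pause(self):
        """Ask the stepping loop to stop after the current step."""
        self._paused.set()

    def wait(self):
        """Block the caller until the simulation has been paused."""
        self._paused.wait()
        if self._thread is not None:
            self._thread.join()

sim = ToySimulation()
sim.run(stop_at=1000)  # returns at once; stepping continues in the background
sim.wait()             # without this, the script could end before the stop rule fires
print(sim.iter)
```

The point of the sketch is only that the result read after wait() reflects the state at the pause, whereas a script that ends or is interrupted earlier sees whatever state the simulation happened to be in.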

Cheers
Jan

Ziyu Wang (ziyuwang1) said :
#8

>How do you run/stop the simulation in non-batch run?

In both the batch and the normal run, the simulation is started and stopped the same way: O.run() and O.pause().

I will follow your suggestion and use O.wait(), and will report back soon.

Jan Stránský (honzik) said :
#9

Are you sure that in normal run you wait until actual O.pause() is executed?

Sorry if these seem like stupid questions. I just cannot see any reason for the difference, so I am chasing every possible hint.

Cheers
Jan

Ziyu Wang (ziyuwang1) said :
#10

>Are you sure that in normal run you wait until actual O.pause() is executed?
Yes, I am sure.

Good news, Jan. Your idea is correct. When I change O.pause() to O.wait(), I get the same result as the normal run, with either single-core or multi-core operation.
Thanks for your help!

Ziyu Wang (ziyuwang1) said :
#11

Thanks Jan Stránský, that solved my question.

Jan Stránský (honzik) said :
#12

> When I change O.pause() to O.wait()

O.pause()? Not waitIfBatch()?

Jan

Ziyu Wang (ziyuwang1) said :
#13

Hi Jan

Yes, the only change I made was to replace O.pause() with O.wait(). The end of the script is still waitIfBatch().
In fact, this causes the batch not to exit automatically after all runs are completed. However, the results are now consistent with the separate run.