MPI error at time step 50000

Asked by Wang Yaqiong

admin123@admin:/media/admin123/4A74F9CF442285AE/Wang YQ/example/未命名文件夹/60$ mpirun -np 6 esysparticle 111.py
Invalid MIT-MAGIC-COOKIE-1 keyCSubLatticeControler::initMPI()
CSubLatticeControler::initMPI()
CSubLatticeControler::initMPI()
CSubLatticeControler::initMPI()
CSubLatticeControler::initMPI()
slave started at local/global rank 4 / 5
slave started at local/global rank 3 / 4
slave started at local/global rank 0 / 1
slave started at local/global rank 2 / 3
slave started at local/global rank 1 / 2
......
......
......
Particle number = 96800 Step= 49000
top wall y position 0.043902837544149 top wall y force= -3599.174583186117
shear displacement= 0 bottom wall y force= 2060.2595043558654
calculated top y force 3599.1746599968887 output top y force 3599.174583186117 shear stress 0.5029988959774476
[admin:159255] *** Process received signal ***
[admin:159255] Signal: Segmentation fault (11)
[admin:159255] Signal code: Address not mapped (1)
[admin:159255] Failing at address: (nil)
[admin:159256] *** Process received signal ***
[admin:159256] Signal: Segmentation fault (11)
[admin:159256] Signal code: Address not mapped (1)
[admin:159256] Failing at address: (nil)
[admin:159252] *** Process received signal ***
[admin:159252] Signal: Segmentation fault (11)
[admin:159252] Signal code: Address not mapped (1)
[admin:159252] Failing at address: (nil)
[admin:159253] *** Process received signal ***
[admin:159253] Signal: Segmentation fault (11)
[admin:159253] Signal code: Address not mapped (1)
[admin:159253] Failing at address: (nil)
[admin:159254] *** Process received signal ***
[admin:159254] Signal: Segmentation fault (11)
[admin:159254] Signal code: Address not mapped (1)
[admin:159254] Failing at address: (nil)
[admin:159253] [ 0] [admin:159254] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7fed90bdb090]
[admin:159254] *** End of error message ***
[admin:159256] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7f0a07fd9090]
[admin:159256] *** End of error message ***
[admin:159255] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7f5b24476090]
[admin:159255] *** End of error message ***
[admin:159252] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7fcfeee05090]
[admin:159252] *** End of error message ***
/lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7f46ae43b090]
[admin:159253] *** End of error message ***
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 4 with PID 0 on node admin exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Question information

Language: English
Status: Solved
For: ESyS-Particle
Assignee: No assignee
Solved by: Wang Yaqiong
Wang Yaqiong (wangyaqiong) said :
#1

sim = LsmMpi(numWorkerProcesses=5, mpiDimList=[1, 1, 5])
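
For context, the worker layout above must be consistent with the mpirun process count: mpirun needs numWorkerProcesses + 1 processes (one master plus the workers), and the entries of mpiDimList must multiply to numWorkerProcesses, so -np 6 matches the 1x1x5 worker grid. A minimal sketch, assuming a standard ESyS-Particle script layout (the particle type and neighbour-search values are placeholders, not taken from 111.py):

# Run with: mpirun -np 6 esysparticle script.py  (1 master + 5 workers)
from esys.lsm import LsmMpi

sim = LsmMpi(numWorkerProcesses=5, mpiDimList=[1, 1, 5])
sim.initNeighbourSearch(
    particleType="NRotSphere",  # placeholder particle type
    gridSpacing=2.5,            # placeholder neighbour-search grid spacing
    verletDist=0.5              # placeholder Verlet distance
)
sim.setNumTimeSteps(50000)      # placeholder total number of time steps
sim.setTimeStepSize(1.0e-4)     # placeholder time step size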

Wang Yaqiong (wangyaqiong) said :
#2

The ESyS-Particle installation is the latest version, installed on Ubuntu 20.04 following https://answers.launchpad.net/esys-particle/+faq/3234 .

It is worth noting that after successfully installing ESyS-Particle, I tested the installation using /src/ESyS-Particle/esys-particle-trunk/Doc/Examples/two_particle.py. Interestingly, the output files listed below were produced:
data.csv
snapshot_t=0_0.txt
snapshot_t=0_1.txt
snapshot_t=1000_0.txt
snapshot_t=1000_1.txt
snapshot_t=100_0.txt
snapshot_t=100_1.txt

However, a message similar to the one below was also printed in the terminal:
Invalid MIT-MAGIC-COOKIE-1 keyCSubLatticeControler::initMPI()
slave started at local/global rank 0 / 1

That is somewhat different from the standard output format.

Wang Yaqiong (wangyaqiong) said :
#3

Wang Yaqiong (wangyaqiong) said :
#4

I found that this problem is related to the FieldSaver instructions in the Python script, which have not been updated to match the current source code. After deleting the related FieldSaver instructions, the simulation runs normally.
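
For anyone hitting the same crash: the calls I removed were FieldSaver registrations of the usual kind. A minimal sketch of what such a call looks like, assuming the tutorial-style WallVectorFieldSaverPrms parameters (the wall name, file name, and step counts here are placeholders, not the ones from 111.py):

from esys.lsm import LsmMpi, WallVectorFieldSaverPrms

# Abbreviated setup; the full script also initialises the domain, walls, etc.
sim = LsmMpi(numWorkerProcesses=5, mpiDimList=[1, 1, 5])

# Save the net force on a wall every 1000 steps to a raw time-series file.
sim.createFieldSaver(
    WallVectorFieldSaverPrms(
        wallName=["bottomWall"],           # placeholder wall name
        fieldName="Force",
        fileName="bottom_wall_force.dat",  # placeholder output file
        fileFormat="RAW_SERIES",
        beginTimeStep=0,
        endTimeStep=50000,
        timeStepIncr=1000
    )
)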
