Bad DM normalization happens

Asked by seungchul

Dear all,

I ran into a problem during a calculation: on one of my two machines I get the error message "Bad DM normalization: Qtot, Tr[D*S] = 1142.00000000 1141.82007575".

One more thing: on Computer 2 the calculation runs normally if I use the "Diag.ParallelOverK T" option, but then it runs very slowly.
Do you have any comments?
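For reference, the workaround on Computer 2 is just this one-line change in the input shown below (everything else stays the same):

Diag.ParallelOverK T   # distribute whole k-points over the MPI processes (serial
                       # diagonalization per k-point) instead of letting ScaLAPACK
                       # spread each k-point's matrices over all processes; this
                       # avoids the error here, but is much slower for this system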

The compile options are the same, but the two machines use different versions of the Intel compiler and MKL (the arch.make file is below).

Computer 1: (No error)
CPU: (I don't know the exact model, but it is a 4-year-old machine, so it is not a "Scalable" CPU)
OS: CentOS 7
Compiler: Intel Parallel Studio 2019.3.199

Computer 2: "Bad DM normalization"
CPU: dual Intel® Xeon® Scalable processors (Gold 6130, 22M cache, 2.1 GHz, 16 cores)
OS: CentOS 7
Compiler: Intel Parallel Studio 2018.5.274

======================================

1. Simulation description.
It is a ZnS-ZnO slab with pseudo-hydrogen passivation. The error occurs both with and without the pseudo hydrogens.

XC.functional GGA
XC.Authors PBE
PAO.BasisSize DZP
PAO.EnergyShift 0.015 Ry
MeshCutoff 300 Ry

SlabDipoleCorrection F
SpinPolarized F
SCF.Mix Hamiltonian
SCF.Mix.First F
SCF.Mixer.Method Pulay
SCF.Mixer.Weight 0.1
SCF.Mixer.History 6
MaxSCFIterations 300
Diag.Algorithm expert
Diag.ParallelOverK F

MD.TypeOfRun CG
MD.NumCGsteps 300
MD.MaxForceTol 0.10 eV/Ang
MD.VariableCell F

NumberOfSpecies 5
%block Chemical_Species_label
1 30 Zn
2 16 S
3 8 O
4 201 H15
5 202 H05
%endblock Chemical_Species_label

%block AtomicMass
4 4.0
5 4.0
%endblock AtomicMass

%block SyntheticAtoms
4                   # species index 4 (H15 in Chemical_Species_label)
1 2 3 4             # n quantum numbers of the s, p, d, f valence shells
1.5 0.0 0.0 0.0     # shell occupations: 1.5 electrons in the 1s shell
5                   # species index 5 (H05)
1 2 3 4             # n quantum numbers of the s, p, d, f valence shells
0.5 0.0 0.0 0.0     # shell occupations: 0.5 electrons in the 1s shell
%endblock SyntheticAtoms

[Atomic structure info below. ]
....

2. arch.make

The arch.make file is below. The arch.make files on the two computers are the same, except for the version of the Intel compiler and MKL.

.SUFFIXES:
.SUFFIXES: .f .F .o .a .f90 .F90 .c
SIESTA_ARCH=intelparallelstudio_2019u3_intelmpi

CC = icc
FC=mpiifort
FC_SERIAL = ifort
FPP= $(FC) -E -P
FPP_OUTPUT=
RANLIB=ranlib

SYS=nag

SP_KIND=4
DP_KIND=8
KINDS=$(SP_KIND) $(DP_KIND)

FFLAGS=-w -O3 -heap-arrays -fPIC -static-intel
FPPFLAGS= -DMPI -DMPI_TIMING -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
LDFLAGS=
FFLAGS_DEBUG= -g -O1
FFLAGS_CHECKS= -g -O0 -debug full -traceback -C

ARFLAGS_EXTRA=
FCFLAGS_fixed_f=
FCFLAGS_free_f90=
FPPFLAGS_fixed_F=
FPPFLAGS_free_F90=

MKLROOT=/usr/local/intel2019u3/compilers_and_libraries_2019.3.199/linux/mkl
BLAS_LIBS=
LAPACK_LIBS=-L$(MKLROOT)/lib/intel64_lin -lmkl_intel_lp64 -lmkl_sequential -lmkl_core
BLACS_LIBS=
SCALAPACK_LIBS=-L$(MKLROOT)/lib/intel64_lin -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64

NETCDF_LIBS=
NETCDF_INTERFACE=

FFT_INCFLAGS= -I$(MKLROOT)/include/fftw
FFT_LIBS=$(MKLROOT)/interfaces/fftw3xf/libfftw3xf_intel.a

Question information
Language: English
Status: Solved
For: Siesta
Solved by: seungchul

seungchul (k-seungchul) said :
#1

I found that this is caused by the Intel compiler and MPI: this version of Intel MPI is the problem.
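In case it is useful to others hitting the same error, a quick way to check which Intel MPI a build environment actually picks up (assuming the standard Intel MPI wrappers are on the PATH; ./siesta is just a placeholder for the binary):

which mpiifort               # which compiler wrapper is found first in the PATH
mpiifort -v                  # reports the Intel MPI wrapper and underlying ifort versions
mpirun --version             # reports the Intel MPI library version used at run time
ldd ./siesta | grep -i mpi   # shows which libmpi an existing binary is linked against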

Nick Papior (nickpapior) said :
#2

Thanks for returning with the cause of this!

Alberto Garcia (albertog) said :
#3

Was your successful run done with some other version of Intel MPI (which one?), or did you switch to another implementation?
Thank you for your feedback. We are finding a number of problems with IMPI and might benefit from some extra data points.

seungchul (k-seungchul) said :
#5

Intel 2018 update 4 and Intel 2017 update 6 cause MPI problems.
2017 was a little better than 2018, but not by much.

Intel 2018 also causes a memory problem: memory usage gradually increases, not only for SIESTA but also for Quantum ESPRESSO and VASP.

I solved this with Intel 2019 update 3. It is commercial, not free yet.
(I have run into trouble with 2019u3, though: the memory-increase problem during MD; it ran out of memory after ~2000 MD steps.)

Intel 2016 is OK, as far as I have heard from a friend.
