tbtrans segfault

Asked by Tim Roger on 2017-03-02

Hello,

I am trying to compile siesta with NetCDF and HDF5. After compiling it, I noticed that tbtrans segfaults in some cases, so I added the following flags to arch.make: -c -O0 -g -check bounds -traceback -xHost. I then took the examples from Tests/TranSiesta-TBTrans, ran run_tests.sh, and got:
==> Running bulk_au_111 with tbtrans=mpirun -np 24 /mnt/sdb1/scanya/siesta-4.0/Util/TBTrans/tbtrans
forrtl: severe (408): fort: (2): Subscript #2 of the array KXY has value 2 which is greater than the upper bound of 1

Image PC Routine Line Source
libnetcdff.so.6 00007FB3EC93E693 Unknown Unknown Unknown
tbtrans 0000000000425943 MAIN__ 1074 tbtrans.F
tbtrans 000000000040643E Unknown Unknown Unknown
libc.so.6 0000003EAB61ED1D Unknown Unknown Unknown
tbtrans 0000000000406329 Unknown Unknown Unknown
The scattering region calculation did not go well ...
 **** Test ts_au did not complete successfully

I've compiled it on Scientific Linux CERN SLC release 6.8 (Carbon) with
ifort (IFORT) 15.0.3 20150407
hdf5/1.8.18_impi-5.0.3_icc-15.0.3
netcdf/4.4.1.1_impi-5.0.3_icc-15.0.3

I can't figure out where the problem is.
---
Regards,
Tim

Question information

Language: English
Status: Solved
For: Siesta
Assignee: No assignee
Solved by: Nick Papior
Solved: 2017-03-06
Last query: 2017-03-06
Last reply: 2017-03-06

Nick Papior (nickpapior) said : #1

In 4.0 there are two tbtrans versions:
-TBTrans
-TBtrans_rep

The former is the older version of tbtrans, with parallelization over k-points.
The latter is the newer version of tbtrans, with parallelization over E-points. This version has some more features than the former one.

None of the tbtrans versions in 4.0 implements anything related to NetCDF.

In 4.1 and beyond there is only one tbtrans, with MANY new features as well as complete NetCDF support. For this version you are highly recommended to build with NetCDF (and with -DNCDF_4 in FPPFLAGS for complete functionality). Although 4.1 is still in beta, it may be worthwhile to test it as well. Note that transiesta and tbtrans are completely rewritten in 4.1, so all fdf-flags have changed.
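
As a rough illustration of the NetCDF setup for 4.1 described above, an arch.make fragment might look like the sketch below. Only -DNCDF_4 in FPPFLAGS comes from this thread. The NETCDF_ROOT and HDF5_ROOT paths are placeholders, and the additional -DCDF and -DNCDF defines and the library list are assumptions that should be checked against the 4.1 manual and your own installation.

```make
# Sketch of the NetCDF-related part of arch.make for siesta 4.1.
# NETCDF_ROOT and HDF5_ROOT are placeholder paths (assumptions); point
# them at your own installations.
NETCDF_ROOT = /path/to/netcdf
HDF5_ROOT   = /path/to/hdf5

# -DNCDF_4 is the flag named in this thread; -DCDF and -DNCDF are
# assumed to be needed alongside it (check the 4.1 manual).
FPPFLAGS += -DCDF -DNCDF -DNCDF_4

# Assumed include and link lines for a NetCDF-4 build on top of HDF5.
INCFLAGS += -I$(NETCDF_ROOT)/include
LIBS     += -L$(NETCDF_ROOT)/lib -lnetcdff -lnetcdf \
            -L$(HDF5_ROOT)/lib -lhdf5_hl -lhdf5 -lz
```

The NetCDF and HDF5 libraries should be built with the same compiler and MPI as siesta itself, as with the impi/icc module builds listed in the question.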

I would recommend you try the 4.0/TBTrans_rep version first.

Tim Roger (tim12) said : #2

Hello Nick,

Thank you for the answer. Is there a set of reference tests for TBTrans_rep in siesta-4.0? I would like to confirm that my compilation is fine. I looked through the TranSiesta-TBTrans directory, but it only contains tests that invoke TBTrans, not TBTrans_rep. Are these tests OK for TBTrans_rep?
---
Regards,
Tim

Nick Papior (nickpapior) said : #3

Yes. They should be compatible.

Tim Roger (tim12) said : #4

Thank you. It looks like tbtrans_rep works fine. Do you have any reference benchmark tests for it? I would like to know whether my compilation is OK or whether it could run faster.

Nick Papior (nickpapior) said : #5

I have to say that providing benchmarks to compare against is nearly impossible. It requires _exactly_ the same architecture and compilers to check whether you have compiled with good optimizations etc.

I would advise you to refer to your compiler's documentation on how to choose suitable optimizations for your architecture without reducing accuracy (very important).

In other words, your inquiry is beyond the scope of the siesta developers; refer to your compiler vendor for this.

Tim Roger (tim12) said : #6

Do you have any internal tests? For example, a siesta build for x86 on a Xeon (processor model), compiled with Intel icc (compiler version) and mpicc, together with an input file? I would like to know whether my compilation is optimal or whether I could speed it up. Where could I look for this?

Nick Papior (nickpapior) said : #7 (best answer)

As I said (#5), benchmarks do not make sense unless they are run on exactly the same architecture.

Look up the performance flags for your compiler. Reasonable flags for Intel are:
-O2 -m64 -xHost -prec-div -prec-sqrt -opt-prefetch

There are many more flags to tune, but these seem reasonable.
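
To make this concrete, here is a minimal sketch of how these flags could be set in arch.make. The optimization flags are the ones listed above and the debug flags are the ones from the original question. Using mpiifort as FC is an assumption based on the compiler wrapper mentioned elsewhere in this thread, and the layout of your own arch.make may differ.

```make
# Compiler wrapper (assumption: Intel MPI's mpiifort; use your own).
FC = mpiifort

# Optimized build: the Intel flags suggested above.
FFLAGS = -O2 -m64 -xHost -prec-div -prec-sqrt -opt-prefetch

# Debug build for chasing segfaults, using the flags from the original
# question (-c is left out on the assumption that the build rules
# already pass it). Uncomment to override the optimized FFLAGS.
#FFLAGS = -O0 -g -check bounds -traceback
```

Switching between the two FFLAGS lines and rebuilding from a clean object directory is a simple way to check whether a crash depends on the optimization level.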

Tim Roger (tim12) said : #8

When I compile siesta and transiesta with the -O2 flag and mpiifort, the result always segfaults. Only -O0 works fine. Thank you for your help and patience.

Tim Roger (tim12) said : #9

Thanks Nick Papior, that solved my question.

Nick Papior (nickpapior) said : #10

Intel compilers optimize very aggressively. A segfault in siesta/transiesta built with Intel compilers is probably due to atom.F being compiled at too high an optimization level.
Use a lower optimization level for that file (see e.g. the intel.make file in the siesta-4.1-b2 releases).
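
A minimal sketch of such a per-file override in arch.make is shown below. It is not copied from intel.make, which remains the reference. The variable name FFLAGS_DEBUG and the -O1 level are illustrative assumptions, and the recipe should mirror whatever your existing rule for .F files passes to the compiler.

```make
# Lower optimization level used only for atom.F (assumed values; -O1 is
# an illustration, not a tested recommendation).
FFLAGS_DEBUG = -g -O1

# Explicit rule so that atom.o is built with FFLAGS_DEBUG instead of the
# global FFLAGS. The recipe line must start with a TAB character.
atom.o: atom.F
	$(FC) -c $(FFLAGS_DEBUG) $(INCFLAGS) $(FPPFLAGS) $<
```

Because an explicit rule takes precedence over the generic suffix rule, every other source file keeps the global FFLAGS and only atom.F is compiled at the lower level.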