MPI Communicator Error With PETSc in C++

Asked by L Nagy on 2013-03-04

Hello,

I'm currently playing around with FEniCS in parallel. I have a mesh with three subdomains marked with a MeshFunction, and I would like to see which parts of the mesh (and which subdomains) end up on which processors after partitioning. I currently have the following small C++ program, which I think should do the job:

#include <iostream>
#include <sstream>

#include <dolfin.h>
#include <petscsys.h>
#include <boost/mpi.hpp>

using namespace dolfin;
int main(int argc, char** argv) {

  Mesh mesh("mesh.xml");
  MeshFunction<size_t> meshfn(mesh, "mesh_subs.xml");

  MPICommunicator mpi_comm;
  boost::mpi::communicator comm(*mpi_comm, boost::mpi::comm_attach);

  for (CellIterator c(mesh); !c.end(); ++c) {

    std::stringstream sstr (std::stringstream::in | std::stringstream::out);
    sstr << MPI::process_number() << "\t" << meshfn[*c] << std::endl;

    PetscSynchronizedPrintf(comm, sstr.str().c_str());

  }

  PetscSynchronizedFlush(comm);

  return 0;

}

Unfortunately, when I try to execute this I get the following runtime error:

lesleis@lesleis-virtual-machine:/mnt/hgfs/Dropbox/Documents/fenics/meshfunction/build$ mpirun -np 2 ./meshfunction
[lesleis-virtual-machine:13323] *** Process received signal ***
[lesleis-virtual-machine:13323] Signal: Segmentation fault (11)
[lesleis-virtual-machine:13323] Signal code: Address not mapped (1)
[lesleis-virtual-machine:13323] Failing at address: (nil)
[lesleis-virtual-machine:13323] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7fb98443d4a0]
[lesleis-virtual-machine:13324] *** An error occurred in MPI_Attr_get
[lesleis-virtual-machine:13324] *** on communicator MPI COMMUNICATOR 3 DUP FROM 0
[lesleis-virtual-machine:13324] *** MPI_ERR_OTHER: known error not in list
[lesleis-virtual-machine:13324] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[lesleis-virtual-machine:13323] [ 1] /lib/x86_64-linux-gnu/libc.so.6(__vfprintf_chk+0x11) [0x7fb984510361]
[lesleis-virtual-machine:13323] [ 2] /usr/lib/petscdir/3.2/linux-gnu-c-opt/lib/libpetsc.so.3.2(PetscVFPrintfDefault+0xbe) [0x7fb98564e093]
[lesleis-virtual-machine:13323] [ 3] /usr/lib/petscdir/3.2/linux-gnu-c-opt/lib/libpetsc.so.3.2(PetscSynchronizedPrintf+0x104) [0x7fb98564e3b1]
[lesleis-virtual-machine:13323] [ 4] ./meshfunction(main+0x1a7) [0x4037a7]
[lesleis-virtual-machine:13323] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7fb98442876d]
[lesleis-virtual-machine:13323] [ 6] ./meshfunction() [0x4041fd]
[lesleis-virtual-machine:13323] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 13323 on node lesleis-virtual-machine exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

If I remove the calls to PETSc and just use cout, everything seems to work (although the output to stdout is then unsynchronised between processes), so possibly I'm not acquiring the MPI communicator in the correct manner. Could someone please let me know what I'm doing wrong, or why the above code will not work?
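[Editor's note: as an aside, one library-free way to serialise per-rank output without PETSc is a rank-ordered loop with barriers, using plain MPI. This is a sketch, not part of the original question, and note that barriers only order the writes, not necessarily the stdout forwarding done by mpirun, so interleaving is still possible on some systems:]

```cpp
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // Each rank takes its turn to print; all ranks synchronise at the
  // barrier after every turn, so rank 0 prints first, then rank 1, etc.
  for (int r = 0; r < size; ++r) {
    if (r == rank)
      std::cout << "rank " << rank << " reporting" << std::endl;
    MPI_Barrier(MPI_COMM_WORLD);
  }

  MPI_Finalize();
  return 0;
}
```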

Kind regards
Les

Question information

Language:
English
Status:
Solved
For:
FEniCS Project
Assignee:
No assignee
Solved by:
Garth Wells
Solved:
2013-03-12
Last query:
2013-03-12
Last reply:
2013-03-05
Best answer: Garth Wells (garth-wells) said: #1

PETSc has probably not been initialised, because you're not creating any PETSc objects. Try putting

    SubSystemsManager::init_petsc();

at the top of your function to explicitly initialise PETSc.
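[Editor's note: in context, the top of the questioner's `main` would then look something like the sketch below. `SubSystemsManager::init_petsc()` is DOLFIN's API for this; the rest of the program is unchanged from the question. Without it, DOLFIN only initialises PETSc lazily when the first PETSc-backed object is created, so `PetscSynchronizedPrintf` was being called against an uninitialised PETSc:]

```cpp
#include <dolfin.h>
#include <petscsys.h>

using namespace dolfin;

int main(int argc, char** argv) {
  // Explicitly initialise PETSc before any direct PETSc calls;
  // no PETSc-backed DOLFIN object is created in this program, so
  // the lazy initialisation never happens on its own.
  SubSystemsManager::init_petsc();

  Mesh mesh("mesh.xml");
  // ... rest of the program as in the question ...

  return 0;
}
```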

L Nagy (l-nagy) said: #2

Thanks Garth Wells, that solved my question.