MPI Problem

Asked by Pietro Maximoff on 2010-06-30

Hi

I get the following error whenever I try to use MPI. It runs fine with 1 processor.

I'm not sure why the assertion
    'V.dofmap().global_dimension() <= x.size()'
fails.

What could the problem be?

Many thanks

Pietro

====
Process 0: Partitioned mesh, edge cut is 2224.
Process 1: Partitioned mesh, edge cut is 2224.
domain: dolfin/function/Function.cpp:61: dolfin::Function::Function(const dolfin::FunctionSpace&, dolfin::GenericVector&): Assertion `V.dofmap().global_dimension() <= x.size()' failed.
[node:22085] *** Process received signal ***
[node:22085] Signal: Aborted (6)
[node:22085] Signal code: (-6)
domain: dolfin/function/Function.cpp:61: dolfin::Function::Function(const dolfin::FunctionSpace&, dolfin::GenericVector&): Assertion `V.dofmap().global_dimension() <= x.size()' failed.
[node:22086] *** Process received signal ***
[node:22086] Signal: Aborted (6)
[node:22086] Signal code: (-6)
[node:22085] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7fa5844af8f0]
[node:22085] [ 1] /lib/libc.so.6(gsignal+0x35) [0x7fa584151a75]
[node:22085] [ 2] /lib/libc.so.6(abort+0x180) [0x7fa5841555c0]
[node:22085] [ 3] /lib/libc.so.6(__assert_fail+0xf1) [0x7fa58414a941]
[node:22085] [ 4] /home/pietro/FEniCS/lib/libdolfin.so.0(_ZN6dolfin8FunctionC1ERKNS_13FunctionSpaceERNS_13GenericVectorE+0x1e5) [0x7fa58c553885]
[node:22085] [ 5] ./domain(main+0xfd9) [0x4187b9]
[node:22085] [ 6] /lib/libc.so.6(__libc_start_main+0xfd) [0x7fa58413cc4d]
[node:22085] [ 7] ./domain() [0x4109b9]
[node:22085] *** End of error message ***
[node:22086] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7f863d8168f0]
[node:22086] [ 1] /lib/libc.so.6(gsignal+0x35) [0x7f863d4b8a75]
[node:22086] [ 2] /lib/libc.so.6(abort+0x180) [0x7f863d4bc5c0]
[node:22086] [ 3] /lib/libc.so.6(__assert_fail+0xf1) [0x7f863d4b1941]
[node:22086] [ 4] /home/pietro/FEniCS/lib/libdolfin.so.0(_ZN6dolfin8FunctionC1ERKNS_13FunctionSpaceERNS_13GenericVectorE+0x1e5) [0x7f86458ba885]
[node:22086] [ 5] ./domain(main+0xfd9) [0x4187b9]
[node:22086] [ 6] /lib/libc.so.6(__libc_start_main+0xfd) [0x7f863d4a3c4d]
[node:22086] [ 7] ./domain() [0x4109b9]
[node:22086] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 22086 on node node exited on signal 6 (Aborted).

Question information

Language: English
Status: Solved
For: DOLFIN
Assignee: No assignee
Solved by: Garth Wells
Solved: 2010-08-15
Last query: 2010-08-15
Last reply: 2010-08-14
Pietro Maximoff (segment-x) said : #1

Hi

Could someone please take a look at this problem? It may seem like a stupid question, but I really can't see why it occurs. I assume that if the assertion
  `V.dofmap().global_dimension() <= x.size()'
passes on one processor, there is no reason why it should fail on 'n' processors.

Many thanks

Pietro

Johannes Ring (johannr) said : #2

Can you please provide a simple test program where this error occurs?

Which DOLFIN version do you have?

Garth Wells (garth-wells) said : #3

On 01/07/10 12:44, Pietro Maximoff wrote:

You need to describe your solver in more detail. There are a variety of
issues which may trigger this error.

Garth


Pietro Maximoff (segment-x) said : #4

Below is a simple test program. All it does is create a rectangular mesh and colour it. Any 2D UFL form file can be used for Test.h, as the test program doesn't solve any PDE problem.

I'm using DOLFIN version 0.9.7.

Many thanks

Pietro

========================================

#include <dolfin.h>
#include "Test.h"

using namespace dolfin;

int main()
{
  const int dim = 2; // dimension

  // Create mesh
  Rectangle mesh(0.0, 0.0, 1.0, 1.0, 10, 10);

  // Create function space
  Test::FunctionSpace V(mesh);

  // Set up forms
  Test::BilinearForm a(V, V);
  Test::LinearForm L(V);

  // Access mesh geometry
  MeshGeometry& geometry = mesh.geometry();

  // Associate a vertex to a specific cell
  double x0, x1;
  Array<double>* xy;

  Vector volt(mesh.num_vertices());
  double block[mesh.num_vertices()];
  dolfin::uint rows[mesh.num_vertices()];

  int i = 0;
  for (VertexIterator vertex(mesh); !vertex.end(); ++vertex)
  {
    xy = new Array<double>(dim, geometry.x(vertex->index()));
    x0 = (*xy)[0];
    x1 = (*xy)[1];

    if (x0 <= 0.4)
      block[i] = 40;
    else
      block[i] = -86;

    delete xy; // free the wrapper allocated each iteration
    ++i;
  }

  // Indices of the rows of the vector to set
  for (unsigned int i = 0; i < mesh.num_vertices(); ++i)
    rows[i] = i;

  // Output file in VTK format
  File file("test_out.pvd");

  volt.set(block, mesh.num_vertices(), rows);
  Function u(V, volt);

  file << u;

  plot(u);
}

=================================================

Niclas Jansson (njansson) said : #5

Pietro Maximoff <email address hidden> writes:


If I remember correctly, Vector v(N) creates a vector of global size
N, but mesh.num_vertices() returns the size of the local mesh.

Niclas

> ========================================
>
> #include <dolfin.h>
> #include "Test.h"
>
> using namespace dolfin;
>
> int main()
> {
> const int dim = 2; // dimension
>
> // Create mesh
>
> Rectangle mesh(0.0, 0.0, 1.0, 1.0, 10, 10);
>
> // Create function space
> Test::FunctionSpace V(mesh);
>
> // Set up forms
> Test::BilinearForm a(V, V);
> Test::LinearForm L(V);
>
> // Access mesh geometry
> MeshGeometry& geometry = mesh.geometry();
>
> // Associate a vertex to a specific cell
> double x0, x1;
> Array<double> *xy;
>
> Vector volt(mesh.num_vertices());
> double block[mesh.num_vertices()];
> dolfin::uint rows[mesh.num_vertices()];
>
> int i = 0;
> for (VertexIterator vertex(mesh); !vertex.end(); ++vertex) {
> xy = new Array<double>(dim, geometry.x(vertex->index()));
> x0 = (*xy)[0];
> x1 = (*xy)[1];
>
> if (x0 <= 0.4)
> block[i] = 40;
> else
> block[i] = -86;
>
> ++i;
> }
>
> // Indices of the row of the vector to set
> for (unsigned int i = 0; i < mesh.num_vertices(); ++i)
> rows[i] = i;
>
> // Output file in VTK format
> File file("test_out.pvd");
>
> volt.set(block, mesh.num_vertices(), rows);
> Function u(V, volt);
>
> file << u;
>
> plot(u);
> }
>
> =================================================
>
> --
> You received this question notification because you are a member of
> DOLFIN Team, which is an answer contact for DOLFIN.
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dolfin
> Post to : <email address hidden>
> Unsubscribe : https://launchpad.net/~dolfin
> More help : https://help.launchpad.net/ListHelp

Pietro Maximoff (segment-x) said : #6

--- On Thu, 7/1/10, Niclas Jansson <email address hidden> wrote:
> If I remember correctly, Vector v(N) creates a vector of global size
> N, but mesh.num_vertices() returns the size of the local mesh.
>
> Niclas

But in this instance,
    mesh.num_vertices() == V.dofmap().dofs(mesh).size()
is true.

Pietro

Pietro Maximoff (segment-x) said : #7

Running the same program on Mac OS X, the error produced is:

Assertion failed: (index < N), function index_owner, file dolfin/main/MPI.cpp, line 486.
[Gambit-3:07240] *** Process received signal ***
[Gambit-3:07240] Signal: Abort trap (6)
[Gambit-3:07240] Signal code: (0)
Assertion failed: (index < N), function index_owner, file dolfin/main/MPI.cpp, line 486.
[Gambit-3:07241] *** Process received signal ***
[Gambit-3:07241] Signal: Abort trap (6)
[Gambit-3:07241] Signal code: (0)
[Gambit-3:07241] [ 0] 2 libSystem.B.dylib 0x00007fff832c335a _sigtramp + 26
[Gambit-3:07241] [ 1] 3 libSystem.B.dylib 0x00007fff832680aa tiny_malloc_from_free_list + 1196
[Gambit-3:07241] [ 2] 4 libSystem.B.dylib 0x00007fff8333e9b4 __pthread_markcancel + 0
[Gambit-3:07241] [ 3] 5 libdolfin.dylib 0x00000001001ac9c0 _ZN6dolfin3MPI10distributeERSt6vectorIjSaIjEES4_ + 0
[Gambit-3:07241] [ 4] 6 libdolfin.dylib 0x000000010016cb1d _ZN6dolfin15SparsityPattern5applyEv + 397
[Gambit-3:07241] [ 5] 7 libdolfin.dylib 0x00000001000dbda9 _ZN6dolfin22SparsityPatternBuilder5buildERNS_22GenericSparsityPatternERKNS_4MeshERSt6vectorIPKNS_13GenericDofMapESaIS9_EEbb + 1593
[Gambit-3:07241] [ 6] 8 libdolfin.dylib 0x00000001000af867 _ZN6dolfin14AssemblerTools18init_global_tensorERNS_13GenericTensorERKNS_4FormERNS_3UFCEbb + 743
[Gambit-3:07241] [ 7] 9 libdolfin.dylib 0x00000001000ad7d7 _ZN6dolfin9Assembler8assembleERNS_13GenericTensorERKNS_4FormEPKNS_12MeshFunctionIjEES9_S9_bb + 215
[Gambit-3:07241] [ 8] 10 libdolfin.dylib 0x00000001000ae356 _ZN6dolfin9Assembler8assembleERNS_13GenericTensorERKNS_4FormEbb + 134
[Gambit-3:07241] [ 9] 11 mono 0x000000010000b77d main + 5037
[Gambit-3:07241] [10] 12 mono 0x0000000100001bc4 start + 52
[Gambit-3:07241] [11] 13 ??? 0x0000000000000001 0x0 + 1
[Gambit-3:07241] *** End of error message ***
[Gambit-3:07240] [ 0] 2 libSystem.B.dylib 0x00007fff832c335a _sigtramp + 26
[Gambit-3:07240] [ 1] 3 libSystem.B.dylib 0x00007fff832680aa tiny_malloc_from_free_list + 1196
[Gambit-3:07240] [ 2] 4 libSystem.B.dylib 0x00007fff8333e9b4 __pthread_markcancel + 0
[Gambit-3:07240] [ 3] 5 libdolfin.dylib 0x00000001001ac9c0 _ZN6dolfin3MPI10distributeERSt6vectorIjSaIjEES4_ + 0
[Gambit-3:07240] [ 4] 6 libdolfin.dylib 0x000000010016cb1d _ZN6dolfin15SparsityPattern5applyEv + 397
[Gambit-3:07240] [ 5] 7 libdolfin.dylib 0x00000001000dbda9 _ZN6dolfin22SparsityPatternBuilder5buildERNS_22GenericSparsityPatternERKNS_4MeshERSt6vectorIPKNS_13GenericDofMapESaIS9_EEbb + 1593
[Gambit-3:07240] [ 6] 8 libdolfin.dylib 0x00000001000af867 _ZN6dolfin14AssemblerTools18init_global_tensorERNS_13GenericTensorERKNS_4FormERNS_3UFCEbb + 743
[Gambit-3:07240] [ 7] 9 libdolfin.dylib 0x00000001000ad7d7 _ZN6dolfin9Assembler8assembleERNS_13GenericTensorERKNS_4FormEPKNS_12MeshFunctionIjEES9_S9_bb + 215
[Gambit-3:07240] [ 8] 10 libdolfin.dylib 0x00000001000ae356 _ZN6dolfin9Assembler8assembleERNS_13GenericTensorERKNS_4FormEbb + 134
[Gambit-3:07240] [ 9] 11 mono 0x000000010000b77d main + 5037
[Gambit-3:07240] [10] 12 mono 0x0000000100001bc4 start + 52
[Gambit-3:07240] [11] 13 ??? 0x0000000000000001 0x0 + 1
[Gambit-3:07240] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 7241 on node Gambit-3.local exited on signal 6 (Abort trap).

========================================================

Lines 484 - 486 of dolfin/main/MPI.cpp are:

dolfin::uint dolfin::MPI::index_owner(uint index, uint N)
{
  assert(index < N);
  ......

I can't see why this assertion fails.

Pietro

Garth Wells (garth-wells) said : #8

On 01/07/10 13:16, Pietro Maximoff wrote:
You need to make the program as simple as possible - right back to the
trivial case where the code does nothing, if necessary - and build it up
until you get the error.

Garth


Niclas Jansson (njansson) said : #9

Yes, but the assert was triggered by the global dimension of the function space, not the size of the dofmap on the mesh.

Create the vector with a size equal to the global (processor-wise) dimension.


Pietro Maximoff (segment-x) said : #10

Niclas:

How would I go about doing this, i.e., creating the vector with a size equal to the global (processor-wise) dimension?

Garth:

I've done what you suggested. It turns out the problem occurs when I create the Function from the Vector, i.e.,

         Function u(V, volt);

So:

        volt.set(block, mesh.num_vertices(), rows); // <-- no problem
        Function u(V, volt);                        // <-- crashes!

How can I fix this?

Many thanks

Pietro

Pietro Maximoff (segment-x) said : #11

Hi

Could someone please take another look at this problem? I've tried the suggestions and I'm still unable to fix it. If it's supposed to be easy, it is unfortunately far from trivial for me.

The MPI problem, I believe, comes from setting the individual components of a Vector and then creating a Function from it. So, alternatively, if there's a better (MPI-compatible) way to gather components into a Vector and then create a Function, please let me know.

As I said before, this works fine on 1 processor, but it takes a 'while'.

Many thanks

Pietro

Garth Wells (garth-wells) said : #12

V.dofmap().dofs(mesh).size();

and

V.dofmap().global_dimension()

are not the same thing. You need to take more care in creating vectors with the correct dimension.
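
[Editor's note: a minimal sketch of the resulting fix, assuming the DOLFIN 0.9.x API already used in this thread; the names V, mesh, and Test are from the program above, and this is a fragment, not a drop-in patch:]

```cpp
// Size the vector by the function space's global dimension,
// not by the local vertex count.
Test::FunctionSpace V(mesh);
Vector volt(V.dofmap().global_dimension()); // same global size on every process
Function u(V, volt);                        // the assertion now holds in parallel
```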

Pietro Maximoff (segment-x) said : #13

Thanks Garth - creating the vectors with V.dofmap().global_dimension() does solve the MPI problem. Apologies for not acknowledging the reply much earlier.

Is there a function/way to associate each of the vector's V.dofmap().global_dimension() entries with its coordinates, e.g., a function that gives the coordinates of the support points?

Thanks

Pietro

Best Garth Wells (garth-wells) said : #14

On Sat, 2010-08-14 at 22:16 +0000, Pietro Maximoff wrote:

Yes. Take a look at DofMap::tabulate_coordinates.

Garth

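
[Editor's note: a hedged sketch of how such a call might look; the exact DofMap::tabulate_coordinates signature in 0.9.x is an assumption here and should be checked against the DofMap header before use:]

```cpp
// Sketch only - the signature below is assumed, not verified against 0.9.7.
// For each cell, tabulate the coordinates of its local dofs ("support points").
for (CellIterator cell(mesh); !cell.end(); ++cell)
{
  UFCCell ufc_cell(*cell);
  // coords is assumed to be a preallocated [dofs_per_cell][gdim] array
  V.dofmap().tabulate_coordinates(coords, ufc_cell);
}
```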

Pietro Maximoff (segment-x) said : #15

Thanks.

Pietro Maximoff (segment-x) said : #16

Thanks Garth Wells, that solved my question.
