Parallel computing with C++ DOLFIN 1.0

Asked by Nguyen Van Dang

Hello,
I would like to use parallel computing with 4 processors on my laptop. I am using C++ DOLFIN 1.0.
My code:
int main()
{
    parameters["num_threads"] = 4;

    ....

    printf("Initial position: %d, x=%f, y=%f\n",
           initial_pos, mesh.geometry().x(initial_pos, 0), mesh.geometry().x(initial_pos, 1));

    Function Ini(V);
    Ini.vector().setitem(initial_pos, 1.0);

    Signal::Functional signal(mesh, Ini);
    double signal_value = assemble(signal);
    printf("The signal of v before normalizing is: %.15g\n", signal_value);

    Ini.vector().setitem(initial_pos, 1.0/signal_value);
    signal_value = assemble(signal);
    printf("The signal of v after normalizing is: %.15g\n", signal_value);

    ........
}
where "Signal.ufl" has the contents

element = FiniteElement("Lagrange", "triangle", 1)
v = Coefficient(element)
M = v*dx
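
With the initial coefficient set to 1 at a single vertex and 0 elsewhere (as in the code above), assembling this functional simply integrates the corresponding P1 hat function over the mesh,

  M(v) = \int_\Omega v \, dx = \sum_i v_i \int_\Omega \phi_i \, dx,

which is the value the code then divides by to normalize.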

errors:
Initial position: 480, x=0.000000, y=0.000000
*** Warning: Form::coloring does not properly consider form type.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Signal received!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 5, Mon Sep 27 11:51:54 CDT 2010
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Unknown Name on a linux-gnu named ubuntu by nguyenvandang Mon Oct 3 05:10:37 2011
[0]PETSC ERROR: Libraries linked from /build/buildd/petsc-3.1.dfsg/linux-gnu-c-opt/lib
[0]PETSC ERROR: Configure run at Mon Mar 7 18:34:33 2011
[0]PETSC ERROR: Configure options --with-shared --with-debugging=0 --useThreads 0 --with-clanguage=C++ --with-c-support --with-fortran-interfaces=1 --with-mpi-dir=/usr/lib/openmpi --with-mpi-shared=1 --with-blas-lib=-lblas --with-lapack-lib=-llapack --with-blacs=1 --with-blacs-include=/usr/include --with-blacs-lib="[/usr/lib/libblacsCinit-openmpi.so,/usr/lib/libblacs-openmpi.so]" --with-scalapack=1 --with-scalapack-include=/usr/include --with-scalapack-lib=/usr/lib/libscalapack-openmpi.so --with-mumps=1 --with-mumps-include=/usr/include --with-mumps-lib="[/usr/lib/libdmumps.so,/usr/lib/libzmumps.so,/usr/lib/libsmumps.so,/usr/lib/libcmumps.so,/usr/lib/libmumps_common.so,/usr/lib/libpord.so]" --with-umfpack=1 --with-umfpack-include=/usr/include/suitesparse --with-umfpack-lib="[/usr/lib/libumfpack.so,/usr/lib/libamd.so]" --with-spooles=1 --with-spooles-include=/usr/include/spooles --with-spooles-lib=/usr/lib/libspooles.so --with-hypre=1 --with-hypre-dir=/usr --with-scotch=1 --with-scotch-include=/usr/include/scotch --with-scotch-lib=/usr/lib/libscotch.so --with-hdf5=1 --with-hdf5-dir=/usr
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 59.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

When I commented out the two lines containing "signal_value = assemble(signal);", it worked. Can you help me fix this problem?
Thanks in advance,
Nguyen Van Dang

Question information

Language: English
Status: Answered
For: DOLFIN
Assignee: No assignee
Johan Hake (johan-hake) said :
#1

The default coloring algorithm is set to "Boost". For large meshes this can be
troublesome. If you have compiled DOLFIN with Trilinos, you could try the
"Zoltan" method:

  parameters["graph_coloring_library"] = "Zoltan";

Johan


Nguyen Van Dang (dang-1032170) said :
#2

Thanks Johan. I replaced
parameters["num_threads"] = 4;
 by
parameters["graph_coloring_library"] = "Zoltan";
It worked. However, this method didn't improve the running time. Can you tell me if there is something wrong?
Thanks.
Nguyen Van Dang

Johan Hake (johan-hake) said :
#3

On Monday October 3 2011 13:41:06 Nguyen Van Dang wrote:
> Thanks Johan. I replaced
> parameters["num_threads"] = 4;
> by
> parameters["graph_coloring_library"] = "Zoltan";
> It worked.

Well... you do not know that. By setting the number of threads to 0 (its
default) you disable the colored partitioning of the cells, so the coloring
is never exercised.

> However, this method didn't improve the running time. Can you
> tell me if there is something wrong?

keep:

  parameters["num_threads"] = 4;

while adding:

  parameters["graph_coloring_library"] = "Zoltan";

Johan
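
Putting the two together, a minimal sketch of a DOLFIN 1.0 main() with both settings (the mesh, function space and forms are assumed to be set up as in the original code) could look like:

  #include <dolfin.h>

  using namespace dolfin;

  int main()
  {
    // Enable OpenMP-threaded assembly with 4 threads
    parameters["num_threads"] = 4;

    // Use Zoltan graph coloring (requires DOLFIN built with Trilinos)
    // instead of the default Boost coloring
    parameters["graph_coloring_library"] = "Zoltan";

    // ... set up mesh, function space and forms, then assemble ...

    return 0;
  }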


Nguyen Van Dang (dang-1032170) said :
#4

Hi Johan,
In fact, I haven't used it before. Now, I keep both of them. When I set
parameters["num_threads"] = 0;

parameters["graph_coloring_library"] = "Zoltan";
it worked.
When I set
parameters["num_threads"] = 1;
parameters["graph_coloring_library"] = "Zoltan";
I got errors again. Please show me what I need to do.
Thank you very much for your help.
Best regards
Nguyen Van Dang

 *** Warning: Form::coloring does not properly consider form type.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Signal received!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 5, Mon Sep 27 11:51:54 CDT 2010
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Unknown Name on a linux-gnu named ubuntu by nguyenvandang Mon Oct 3 12:44:01 2011
[0]PETSC ERROR: Libraries linked from /build/buildd/petsc-3.1.dfsg/linux-gnu-c-opt/lib
[0]PETSC ERROR: Configure run at Mon Mar 7 18:34:33 2011
[0]PETSC ERROR: Configure options --with-shared --with-debugging=0 --useThreads 0 --with-clanguage=C++ --with-c-support --with-fortran-interfaces=1 --with-mpi-dir=/usr/lib/openmpi --with-mpi-shared=1 --with-blas-lib=-lblas --with-lapack-lib=-llapack --with-blacs=1 --with-blacs-include=/usr/include --with-blacs-lib="[/usr/lib/libblacsCinit-openmpi.so,/usr/lib/libblacs-openmpi.so]" --with-scalapack=1 --with-scalapack-include=/usr/include --with-scalapack-lib=/usr/lib/libscalapack-openmpi.so --with-mumps=1 --with-mumps-include=/usr/include --with-mumps-lib="[/usr/lib/libdmumps.so,/usr/lib/libzmumps.so,/usr/lib/libsmumps.so,/usr/lib/libcmumps.so,/usr/lib/libmumps_common.so,/usr/lib/libpord.so]" --with-umfpack=1 --with-umfpack-include=/usr/include/suitesparse --with-umfpack-lib="[/usr/lib/libumfpack.so,/usr/lib/libamd.so]" --with-spooles=1 --with-spooles-include=/usr/include/spooles --with-spooles-lib=/usr/lib/libspooles.so --with-hypre=1 --with-hypre-dir=/usr --with-scotch=1 --with-scotch-include=/usr/include/scotch --with-scotch-lib=/usr/lib/libscotch.so --with-hdf5=1 --with-hdf5-dir=/usr
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 59.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

Johan Hake (johan-hake) said :
#5

On Monday October 3 2011 21:25:49 Nguyen Van Dang wrote:
> Hi Johan,
> In fact, I haven't used it before. Now, I keep both of them. When I set
> parameters["num_threads"] = 0;
>
> parameters["graph_coloring_library"] = "Zoltan";
> it worked.

It is not crashing, because it is not using the graph coloring. When you set

  parameters["num_threads"] = 1;

you trigger the graph coloring, which I guess is what eventually causes the crash.

> When I set
> parameters["num_threads"] = 1;
> parameters["graph_coloring_library"] = "Zoltan";
> I got errors again.

Shared-memory parallelism (via OpenMP) is still experimental in DOLFIN. That
said, I use it and get pretty nice speedups. How large is your mesh? Could
you upload it somewhere for us to try?

Johan


Nguyen Van Dang (dang-1032170) said :
#6

Hi Johan,
Here is the link to my files: http://www.mediafire.com/file/ta1mklns3m5qywy/sendfiles.zip. Could you help me check the errors?
Thank you very much for your help.
Best regards,
Nguyen Van Dang

Johan Hake (johan-hake) said :
#7


There is a bug in the OpenMP assembly of scalars:

  https://bugs.launchpad.net/bugs/860040

A workaround is to leave the number of threads at 0 at the start (the default) and then
set it to whatever you want in front of the more time-consuming vector and matrix
assemblies:

    parameters["num_threads"] = 0;
    parameters["graph_coloring_library"] = "Zoltan";

...

    parameters["num_threads"] = 4;
    mass_matrix.f = one;
    assemble(M, mass_matrix);

Johan
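
In context, the workaround could look roughly like the following sketch (Signal comes from the question above; the MassMatrix form, the coefficient "one" and the remaining setup are assumed, as in the snippet above):

  // Threads off (the default) while assembling the scalar functional,
  // to avoid bug 860040
  parameters["num_threads"] = 0;
  parameters["graph_coloring_library"] = "Zoltan";

  Signal::Functional signal(mesh, Ini);
  double signal_value = assemble(signal);

  // Threads on only for the expensive matrix assembly
  parameters["num_threads"] = 4;
  Matrix M;
  mass_matrix.f = one;
  assemble(M, mass_matrix);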

Nguyen Van Dang (dang-1032170) said :
#8

Hi,
It worked. However, it didn't improve the running time on my computer. I used a mesh with nx = ny = 500.

When I set:
parameters["num_threads"] = 0;

mass_matrix.f = one;
assemble(M,mass_matrix);

running time was
real 0m5.421s
user 0m1.396s
sys 0m0.224s

When I set:
parameters["num_threads"] = 3;

mass_matrix.f = one;

assemble(M,mass_matrix);

running time was
real 7m12.790s
user 0m10.349s
sys 0m12.261s
along with some error messages.

Can you help me to check if it works on your computer?
Thanks in advance
Nguyen Van Dang

Johan Hake (johan-hake) said :
#9

Remember to compare the correct things here. When a matrix is first assembled,
a sparsity pattern is built; this takes time and is done in serial. To properly
measure the time you need to preassemble your matrix and compare the
reassembly time, with reset_sparsity=false passed to assemble. A third thing
to keep in mind is that computing the coloring of the mesh (preparing the cell
assembly for shared-memory parallelism) also takes time. Like the sparsity
pattern, this is only done once. This means that doing just one assembly of a
matrix will most probably take longer with more threads, as the coloring can
take quite a while.

The actual speedup also depends on the mesh and the number of threads you use.
Some combinations use memory more optimally than others. I have had good
speedup for a 3D mesh on a big cluster, where on my laptop I do not see the
same speedups.

Only running problem-specific benchmarks can tell.

Johan
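
As a sketch of that kind of measurement (assuming the assemble() overload with the reset_sparsity flag mentioned above, and using DOLFIN's built-in Timer), one could time only the reassembly:

  Matrix M;

  // First assembly: builds the sparsity pattern (in serial) and colors
  // the mesh, so its timing is not representative of threaded assembly
  assemble(M, mass_matrix);

  // Reassemble into the existing sparsity pattern and time only this part
  Timer timer("reassemble mass matrix");
  assemble(M, mass_matrix, /* reset_sparsity = */ false);
  double elapsed = timer.stop();

  info("Reassembly time: %g s", elapsed);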

