CUDA support

Asked by Alexander Eulitz [Eugen]

Hey guys,
I barely dare to ask, but does Yade support CUDA?
According to posts I found in the archive, Yade isn't that parallelism-friendly, especially when it comes to cluster computing. But what about using the graphics card for faster computing?

greetings,
kubeu

Question information

Language: English
Status: Solved
For: Yade
Assignee: No assignee
Solved by: Anton Gladky
Alexander Eulitz [Eugen] (kubeu) said :
#1

Just found this, in case somebody is interested:
http://vimeo.com/53052481

taken from here: http://mathema.tician.de/software/pycuda

Best Anton Gladky (gladky-anton) said :
#2

Hi,

Yade effectively uses OpenMP parallelism; for some simulations the scaling is almost linear. It is true that Yade is not written as an MPI program, so it will not normally run on clusters.
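
To illustrate: iterating over interactions is exactly the kind of loop that OpenMP can split across threads with a single pragma, which is why the scaling can be near-linear when that loop dominates the timestep. A minimal, self-contained sketch follows (not Yade's actual source; the struct and the loop body are invented for the example):

// Minimal sketch only -- not Yade's actual code; the struct and the loop
// body are invented for illustration. Compile with: g++ -fopenmp sketch.cpp
#include <omp.h>
#include <cstdio>
#include <vector>

struct Interaction { double force; };  // hypothetical stand-in type

int main() {
    std::vector<Interaction> interactions(1000000, Interaction{0.0});

    // One pragma distributes the iterations over the available CPU threads;
    // this works because each interaction is processed independently.
    #pragma omp parallel for
    for (long i = 0; i < (long)interactions.size(); ++i) {
        interactions[i].force += 1.0;  // stand-in for real contact-law work
    }

    std::printf("ran with up to %d threads\n", omp_get_max_threads());
    return 0;
}

In Yade itself the thread count is chosen at startup, e.g. yade -j4, if I remember the option correctly.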

Yade does not support CUDA. As I understand it, that is mostly because none of us has the equipment or the interest, and thus there is no motivation. If you want to work on it, it would be nice to see such code in the development branch.

Cheers.

Anton


Christian Jakob (jakob-ifgt) said :
#3

Hi,

A CUDA implementation of a particle simulation was already presented by Simon Green (NVIDIA).
For details see here:

www.dps.uibk.ac.at/~cosenza/teaching/gpu/nv_particles.pdf
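
For a taste of the approach in that paper: one GPU thread per particle (the real demo additionally builds a uniform spatial grid for collision detection). A minimal CUDA sketch along those lines, all names invented for illustration:

// Sketch of the one-thread-per-particle pattern from the paper; the real
// demo adds a uniform grid for collisions. Build with: nvcc particles.cu
#include <cuda_runtime.h>
#include <cstdio>

__global__ void integrate(float3* pos, float3* vel, float dt, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;                 // guard against the padded last block
    vel[i].y -= 9.81f * dt;             // gravity, explicit Euler step
    pos[i].x += vel[i].x * dt;
    pos[i].y += vel[i].y * dt;
    pos[i].z += vel[i].z * dt;
}

int main() {
    const int n = 1 << 20;              // ~1e6 particles
    float3 *pos, *vel;
    cudaMalloc(&pos, n * sizeof(float3));
    cudaMalloc(&vel, n * sizeof(float3));
    cudaMemset(pos, 0, n * sizeof(float3));
    cudaMemset(vel, 0, n * sizeof(float3));

    integrate<<<(n + 255) / 256, 256>>>(pos, vel, 1e-3f, n);
    cudaDeviceSynchronize();
    std::printf("status: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(pos);
    cudaFree(vel);
    return 0;
}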

Christian.

Alexander Eulitz [Eugen] (kubeu) said :
#4

Thanks Anton Gladky, that solved my question.

Alexander Eulitz [Eugen] (kubeu) said :
#5

Because Yade user Riccardo Carta asked about it again, I have dealt with this topic again.
The link I posted some months ago is quite interesting, but it nevertheless seems not to be usable for Yade. As far as I understand, I was wrong in assuming that one could implement CUDA support through Python.
If someone wants to implement it, they will have to do it in the core C++ code...
Considering this, material like [1] is more interesting ;)

[1] https://developer.nvidia.com/content/easy-introduction-cuda-c-and-c
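
To give a flavor of what [1] covers: the article builds up a SAXPY kernel. A condensed sketch in the same spirit, reconstructed from memory rather than copied from the article, and using unified memory for brevity where the article uses explicit host/device copies:

// Condensed SAXPY (y = a*x + y) in the spirit of [1]; not copied verbatim.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];  // one GPU thread per element
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    std::printf("y[0] = %.1f (expected 4.0)\n", y[0]);
    cudaFree(x);
    cudaFree(y);
    return 0;
}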

Bruno Chareyre (bruno-chareyre) said :
#6

Yes, CUDA needs deep diving into the structure of the code, and I am not really sure it is worth spending time on this, as GPU processing may be abandoned in a few years.
There are interesting discussions on this if you google "GPU Xeon Phi". Quoting [1]:

"In the end, GPGPUs are a work in progress. That is the smiley happy term for the English term, “Bloody mess”. The landscape is littered with broken tools, grand promises, failed projects, changing paradigms, and the occasional success story.
[...] Tools are another win for Phi, everything supports x86 CPUs, and that code simply runs on Phi without so much as a line of code changed. [...] This part simply obsoletes the whole GPGPU paradigm and relegates it to generating pretty pictures, not to serious compute work.
"

[1] http://semiaccurate.com/2012/11/13/what-will-intel-xeon-phi-do-to-the-gpgpu-market/#.UQuNyJETOaA

Alexander Eulitz [Eugen] (kubeu) said :
#7

Interesting stuff, Bruno, thanks for the links!

Of course, there are also publications stating that CUDA was really worth implementing, like [1].
So maybe performance would really benefit if parts of the calculation were done by CUDA. But unfortunately I don't have enough programming expertise...

[1] http://www.isip.uni-luebeck.de/uploads/tx_wapublications/eusipco2011-mazur.pdf

Bruno Chareyre (bruno-chareyre) said :
#8

It would be worth implementing, of course, in the sense that a parallel implementation should normally run faster than a non-parallel one, and 248 cores (GPU) should be better than 4 cores (quad-core CPU).

The point is that GPU (CUDA) support needs a complete re-design of Yade if you want a chance to gain speed; that means a lot of man-hours for the initial implementation and even more man-hours to keep track of changes in the language itself. If ordinary CPUs are also going to have hundreds of cores (Xeon Phi) and would run Yade without any change to the code, why should we spend time adapting Yade to a GPU technology that may soon be obsolete?

Anton Gladky (gladky-anton) said :
#9

Please note also that CUDA is a proprietary technology and, for example, in Debian we will not be able to link CUDA with Yade, as Yade would then go into the so-called "non-free" section of the archive.

Not a big deal for users, but for many distributions such linkage can be problematic.

Cheers,

Anton

Alexander Eulitz [Eugen] (kubeu) said :
#10

Hmmm, as I have seen in my latest performance tests for my current simulation, performance does not profit from a high number of cores. It seems that using one core is best. I still have to complete the tests, but it looks that way; when the tests are done I'll post some curves.
Given this, CUDA is not likely to boost the performance of my simulation, is it?
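
A back-of-the-envelope check with Amdahl's law supports that suspicion: if only a fraction p of each timestep parallelizes, the speedup is capped at 1 / ((1 - p) + p / n) no matter how many cores n you add. A tiny sketch with hypothetical values of p, just to show where the ceiling sits:

// Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n), where p is the fraction
// of the work that parallelizes. The p values below are made up.
#include <cstdio>

double speedup(double p, int n) { return 1.0 / ((1.0 - p) + p / n); }

int main() {
    for (double p : {0.5, 0.9, 0.99}) {
        std::printf("p = %.2f: 4 cores -> %.2fx, 248 cores -> %.2fx\n",
                    p, speedup(p, 4), speedup(p, 248));
    }
    return 0;
}

With p = 0.5, even 248 cores give barely a 2x speedup, so if the serial part dominates my timestep, a GPU would hit the same ceiling.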