Does Savanna have local storage?

Asked by John-Doe Junior

I've just watched a presentation on YouTube and I'm inexperienced with OpenStack.

What are the options for managing the data layer? I've seen there is a plugin for Swift, but how can I keep my data on the same physical host as the DataNode/TaskTracker (dn/tt) to limit network I/O?

Thanks, Guillaume
PS: Is there a mailing list?

Question information

Language:
English
Status:
Solved
For:
Sahara
Assignee:
No assignee
Solved by:
Alexander Kuznetsov
Alexander Ignatov (aignatov) said :
#1

Hi, Guillaume,

Can you explain your use case in detail? As I understand from your question, you want to run M/R jobs over your own data, right?
In that case you need to copy your data to the cluster provisioned by Savanna, or copy it to Swift, which is supported by the Hadoop that Savanna provisions.
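
As a rough illustration of those two options, here is a minimal Python sketch that only builds the Hadoop command lines involved; it does not run them. The helper names, the sample paths, and the `savanna` provider suffix in the `swift://` URL are assumptions made for this example and should be checked against your own cluster configuration.

```python
import shlex


def put_command(local_path, hdfs_dir):
    """Command that copies a local file into the cluster's HDFS."""
    return ["hadoop", "fs", "-put", local_path, hdfs_dir]


def swift_url(container, path, provider="savanna"):
    """Build a swift://<container>.<provider>/<path> URL.

    The provider name is an assumption; use the one configured
    for the Swift file system on your cluster.
    """
    return "swift://{}.{}/{}".format(container, provider, path.lstrip("/"))


def distcp_command(src, dst):
    """Command that copies data between two Hadoop file systems."""
    return ["hadoop", "distcp", src, dst]


if __name__ == "__main__":
    # Option 1: push a local file straight into HDFS (hypothetical paths).
    print(" ".join(shlex.quote(a) for a in put_command("input.csv", "/user/hadoop/input")))

    # Option 2: data already in Swift, copied into HDFS before the job runs.
    src = swift_url("mycontainer", "input/")
    print(" ".join(shlex.quote(a) for a in distcp_command(src, "/user/hadoop/input")))
```

Either way the data crosses the network once while it is being loaded; after that, jobs reading from HDFS can benefit from whatever locality the cluster's storage offers.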

The proper email address: <email address hidden>

Matthew Farrellee (mattf) said :
#2

Be sure to put [savanna] in the subject of your message to openstack-dev@...

John-Doe Junior (x6i4uybz) said :
#3

Hi Alexander,

I'm working on a traditional Hadoop cluster with only physical hosts.

For a production (or long-lived) Hadoop cluster, I want the best performance. For that, I need to limit all bottlenecks, especially network I/O, so I need local storage for the worker nodes.

Is it possible to have that kind of architecture if my cluster is hosted on OpenStack?
How can I create a virtual worker node that reads its input data from the host's physical disks, for instance?

If I use Swift for the input/output data, the data has to be read and written over the network, is that right? I don't want that in my case.

Best Alexander Kuznetsov (akuznetsov) said :
#4

To provide good performance for Hadoop clusters in a virtual environment, Savanna has directly attached disks for VMs on its roadmap. The patch for this has already been committed to Cinder, and Savanna will support this functionality in the future. You can also look at the VMware paper http://www.mellanox.com/pdf/case_studies/VMW-Hadoop-Performance-vSphere5.pdf on Hadoop performance in a virtual environment.

John-Doe Junior (x6i4uybz) said :
#5

Thanks Alexander Kuznetsov, that solved my question.