Cluster creation state in Starting only

Asked by Abhimanyu

Hi All,
I am able to create a cluster with different nodes, and the instances are launched, but the cluster state does not change to Active; it remains in the Starting state.
Also, the service URLs are not generated for the cluster.

Question information

Language: English
Status: Answered
For: Sahara
Assignee: Alexander Ignatov

Alexander Ignatov (aignatov) said :
#1

Hi Abhimanyu,

Very often, after Savanna starts a cluster, you will see that all instances have been provisioned in OpenStack successfully.
But instances being up with status Active does not mean that the Hadoop cluster is ready right away,
because Savanna needs time to connect to all cluster nodes and push configs to them.

So, after starting the cluster you should wait some time for the Hadoop cluster to become ready. This is expected and can take 2-5 minutes. The service URLs are generated at the final step, just before the cluster becomes ready.

But if you have a different problem, please describe it in detail.
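
The waiting described above can be sketched as a small polling loop. This is only an illustration: `get_status` is a hypothetical callable standing in for whatever client call returns the cluster state, and the state names "Active" and "Error" as well as the timeout values are assumptions, not Savanna's API.

```python
import time


def wait_for_active(get_status, timeout=600, interval=10, sleep=time.sleep):
    """Poll get_status() until the cluster is Active or Error, or time runs out.

    get_status is a hypothetical callable returning the current cluster
    state as a string; replace it with your own client call.
    """
    waited = 0
    status = None
    while waited <= timeout:
        status = get_status()
        # Stop as soon as the cluster reaches a final state (assumed names).
        if status in ("Active", "Error"):
            return status
        sleep(interval)
        waited += interval
    raise TimeoutError(
        "cluster still in state %r after %d seconds" % (status, timeout)
    )
```

If the loop times out while the state never leaves Starting, that is the point at which the Savanna logs are worth reading.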

Sergey Lukjanov (slukjanov) said :
#2

Abhimanyu, has your question been resolved?

Gowri (gouribudal) said :
#3

Hi,

I have a similar problem. I am able to launch the instances, but the cluster is in the ERROR state.
I can SSH into the instances, and I can log in to them through the OpenStack console. What could be the problem?
I used the http://savanna-files.mirantis.com/savanna-0.2-vanilla-1.1.2 image.

Please help me out.

Dmitry Mescheryakov (dmitrymex) said :
#4

Hello,

Did you search the Savanna logs for errors/exceptions? Something should be there.
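
A rough way to do that search is sketched below. The keyword list is an assumption about what an interesting line looks like, not a description of Savanna's log format, and the log path is whatever your deployment uses.

```python
import re

# Assumed markers of trouble; adjust them to your own log format.
ERROR_PATTERN = re.compile(r"error|exception|traceback|critical", re.IGNORECASE)


def find_error_lines(lines):
    """Return (line_number, text) pairs for lines that look like errors."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

It accepts any iterable of lines, so an open file object can be passed directly, for example `with open(log_path) as log: print(find_error_lines(log))`, where `log_path` is a placeholder.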

Alexander Ignatov (aignatov) said :
#5

Gowri, hi! Which version of Savanna do you use? If it is newer than savanna-0.2, then the image you named in your question is not the correct one.

Gowri (gouribudal) said :
#6

Hi,

Thanks for the response.

I am using the Fuel 4.0 (latest) installer. I tried the image centos-6_64-hdp-1.3.qcow2 (link: http://docs.openstack.org/developer/savanna/userdoc/hdp_plugin.html). With this image the instances are active and I can SSH into them, but the cluster remains in the 'Waiting' state. Is there something wrong with the steps? (I followed http://savanna.readthedocs.org/en/latest/horizon/dashboard.user.guide.html, and while registering the image I gave the username as 'root', as per the link.)

Savanna/api-log has the following

   error during command execution "sudo chown -R $USER:$USER /opt/oozie/conf"
   Return code 1
   chown: cannot access ' /opt/oozie/conf': No such file or directory

and at the end of the log I can see the following error:
   cluster creating fails with exception :'NoneType'object has no attribute 'name'

What could be the problem?

Gowri (gouribudal) said :
#7

And the console of any instance says ' Booting from Hard Disk ..'

Gowri (gouribudal) said :
#8

Hi,
Sorry, I think I mixed up the details.

1. When I used the http://savanna-files.mirantis.com/savanna-0.2-vanilla-1.1.2 image, the instances are active and I can SSH into them, but the cluster goes into the 'ERROR' state.
   Savanna/api-log has the following:
   >error during command execution "sudo chown -R $USER:$USER /opt/oozie/conf"
   >Return code 1
   >chown: cannot access ' /opt/oozie/conf': No such file or directory
   and at the end of the log I can see the following error:
   >cluster creating fails with exception :'NoneType'object has no attribute 'name'

2. I tried the image centos-6_64-hdp-1.3.qcow2 (link: http://docs.openstack.org/developer/savanna/userdoc/hdp_plugin.html).
   With this image the instances are active but I cannot SSH into them. The console of any instance says 'Booting from Hard Disk ..' and the cluster remains in the 'WAITING' state.
   The log file ends with 'INFO urllib3.connectionpool [-] Starting new HTTP connection (1):172.16.8.13' (the IP is the public IP of the controller node).

What could be the problem? Am I using the wrong images, or am I missing some steps? Please help me out.
(I am using Fuel v4.0.)

Andrew Lazarev (alazarev) said :
#9

Gowri,

1. Fuel 4.0 uses savanna 0.3, so images from savanna 0.2 will not work. See http://docs.mirantis.com/fuel/fuel-4.0/user-guide.html

2. The "Booting from hard disk" message is a sign that the image you are using is broken. Please make sure that it was downloaded correctly and uploaded to glance with "--disk-format qcow2 --container-format=bare". centos-6_64-hdp-1.3.qcow2 should work fine with savanna 0.3.

3. It is better to use https://savanna.readthedocs.org/en/0.3/index.html for 0.3, but I don't see much difference in your case.
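
As a quick sanity check for point 2, the sketch below tests whether a downloaded file at least starts with the qcow2 magic bytes (`QFI` followed by `0xfb`). The function name is made up for illustration. This only catches the case where the download is not a qcow2 file at all, such as a saved HTML error page; it will not detect a truncated image, so comparing the file size or a checksum against the source, if one is published, is still worthwhile.

```python
QCOW_MAGIC = b"QFI\xfb"  # first four bytes of a qcow/qcow2 image file


def looks_like_qcow2(path):
    """Return True if the file at path starts with the qcow2 magic bytes."""
    with open(path, "rb") as image:
        return image.read(4) == QCOW_MAGIC
```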

Gowri (gouribudal) said :
#10

Hi,

Thanks, Andrew Lazarev.

I tried the centos-6-64-hdp-vanilla.qcow2 image as well, but I am still not able to launch a cluster. The instances terminate after going into ERROR status. The savanna log has the error 'RuntimeError : node newcluster_vanilla_worker-001 has error status'.
What does this mean? Am I missing something?

I will download the centos-6_64-hdp-1.3.qcow2 image again and try it out. I will let you know the results.

Thanks for the support.

Andrew Lazarev (alazarev) said :
#11

'RuntimeError : node newcluster_vanilla_worker-001 has error status' means that OpenStack moved the instance to the ERROR state. You should check the OpenStack logs (start with nova) to see what happened. The most common causes I have seen are network misconfiguration and quota limits, but it could be anything else in your particular case.

Also, I see the word 'vanilla' in the node name while you are using an image for HDP. Please make sure that you are using the image for the right plugin (and version of savanna).

Gowri (gouribudal) said :
#12

Hi Andrew Lazarev,

As per your suggestion, I used the VM image centos-6_64-hdp-1.3.qcow2 from [link: http://savanna.readthedocs.org/en/latest/userdoc/hdp_plugin.html].

The VM instances are in the ACTIVE state, and I can ping and SSH into them. But the cluster has been in the 'configuring' state for a day.

The savanna log is as follows:

savanna.plugins.hdp.versions.1_3_2.versionhandler [-] Waiting to connect to ambari server ...
2014-02-10 04:24:21.703 19754 INFO savanna.plugins.hdp.versions.1_3_2.versionhandler [-] Waiting to connect to ambari server ...
2014-02-10 04:24:26.707 19754 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): 172.16.8.28
2014-02-10 04:24:26.708 19754 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): 172.16.8.28
2014-02-10 04:24:26.709 19754 INFO savanna.plugins.hdp.versions.1_3_2.versionhandler [-] Waiting to connect to ambari server ...
2014-02-10 04:24:26.709 19754 INFO savanna.plugins.hdp.versions.1_3_2.versionhandler [-] Waiting to connect to ambari server ...

I found at https://bugs.launchpad.net/savanna/+bug/1259836 that the latest HDP VM image should resolve this issue, but I am still facing the problem.

What could be the problem? Please help.

Gowri (gouribudal) said :
#13

Hi Andrew Lazarev,

Sorry for the confusion. I think I have posted this in another thread as well. I am quite new to this and so a little confused.

Thanks for the support and for helping me make progress.

Gowri (gouribudal) said :
#14

Hi Andrew Lazarev,

I found out that the image is supposed to have a JDK installed, but I could not find it in anaconda.program.log.

I was able to find all the other required packages in the log (such as hadoop-libhdfs, hadoop-native, hadoop-pipes, hadoop-sbin, hadoop-lzo, hadoop-lzo-native, mysql-server, httpd, etc., from the list provided in http://savanna.readthedocs.org/en/latest/userdoc/hdp_plugin.html).
Can you please clarify whether the JDK is pre-installed on the HDP VM image?

Right now the VM instances don't have an internet connection to the outside world (I need to configure a proxy and so on). Can I continue with this image?
Could this be causing the problem?

Jonathan Maron (jmaron) said :
#15

That could be causing the problem. There are still a number of packages that could not be pre-installed due to a bug in the ambari installation sequence. That bug is being tracked and should be addressed. We hope to have a "disconnected" image available soon.

You could log onto the ambari server VM and check the server status to confirm.
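
Before logging in, it may also help to check from the Savanna host whether the Ambari port is reachable at all. The sketch below is a plain TCP check; the host is a placeholder for your Ambari server instance, and port 8080 is assumed to be Ambari's web port, so verify both for your setup.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, `can_connect("172.16.8.28", 8080)` returning False would suggest that the server never started or that the network path is blocked, while True only shows that something is listening on that port.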

Gowri (gouribudal) said :
#17

Hi Jonathan Maron,

Thanks for the response.

When I tried to start the ambari server on the VM, it showed the error: "Unable to detect the system user for Ambari Server. Please run 'ambari-server setup' command to create user"

As I mentioned, the VM instances do not have an internet connection to the outside world (I need to configure a proxy and so on). Is there a workaround for this? Can I download the packages manually and create a repo or something? Will the configuration of the cluster proceed if I do this?

How can I know what other packages have to be installed?

Gowri (gouribudal) said :
#18

Hi Jonathan Maron,

Any updates on the "disconnected" images?

Can you help with this problem?
