Deployment Failed caused by expired Ceph timeout

Asked by Stefan Herrnleben

Hello,

we tried to deploy OpenStack with Mirantis Fuel 7.0. Our test environment contains
1x Controller Node
1x Compute Node
3x Ceph Node
This small setup was chosen for a test deployment; for our final deployment (for educational and research use), additional compute and storage nodes will be added.

As soon as we add more than 2 Ceph nodes for the initial deployment, the setup fails with:

---
Error
Deployment has failed. Method granular_deploy. Failed to execute hook 'shell' command: cd / && ruby /etc/puppet/modules/osnailyfacter/modular/ceph/ceph_ready_check.rb
---

Checking the Ceph OSD status with "ceph osd tree" also shows that only 2 of the 3 Ceph nodes work as desired. The OSDs of the third Ceph node have the status "down".
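For anyone checking the same symptom, here is a minimal sketch of how the down OSDs could be listed from a script. It assumes the JSON form of the command (`ceph osd tree --format json`) is available and that its `nodes` list carries `name`, `type` and `status` fields for each OSD. The sample data is made up for illustration and is not output from this environment.

```python
import json
from typing import List


def down_osds(tree_json: str) -> List[str]:
    """Return the names of all OSD entries whose status is not 'up'."""
    tree = json.loads(tree_json)
    return [
        node["name"]
        for node in tree.get("nodes", [])
        if node.get("type") == "osd" and node.get("status") != "up"
    ]


# Hypothetical sample: host and OSD names are illustrative only.
SAMPLE = json.dumps({
    "nodes": [
        {"id": -1, "name": "default", "type": "root"},
        {"id": -2, "name": "node-3", "type": "host"},
        {"id": 0, "name": "osd.0", "type": "osd", "status": "up"},
        {"id": 1, "name": "osd.1", "type": "osd", "status": "up"},
        {"id": -3, "name": "node-5", "type": "host"},
        {"id": 4, "name": "osd.4", "type": "osd", "status": "down"},
        {"id": 5, "name": "osd.5", "type": "osd", "status": "down"},
    ]
})

print(down_osds(SAMPLE))
```

On a real cluster the JSON string would come from the command itself, for example via `subprocess.check_output(["ceph", "osd", "tree", "--format", "json"])`.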

As soon as we remove one of the Ceph nodes (regardless of which one), the deployment succeeds. The third Ceph node can then be added to the environment afterwards without any problems. But since we want to use many more storage nodes, adding the Ceph nodes sequentially would be a time-consuming process.

Does anybody have an idea why the described timeout occurs? Is there a configuration option to increase the timeout, so that we can test the deployment with higher values?

Best regards,

Stefan

Question information

Language: English
Status: Solved
For: Fuel for OpenStack
Assignee: No assignee
Solved by: Stefan Herrnleben

Denis Klepikov (dklepikov) said :
#1

Hello.

Please try increasing the timeout for the ceph-osd node deployment. Follow these steps:

1. On the Fuel master node, increase the timeout for the ceph-osd and compute roles.
Open the file:
/etc/puppet/modules/osnailyfacter/modular/roles/tasks.yaml

2. In the sections
- id: top-role-compute
and
- id: top-role-ceph-osd
change timeout to 7200 (you can set a different value; the unit is seconds), then save the file.

For example:

- id: top-role-compute
  type: puppet
  groups: [compute]
  required_for: [deploy_end]
  requires: [hosts, firewall]
  parameters:
    puppet_manifest: /etc/puppet/modules/osnailyfacter/modular/roles/compute.pp
    puppet_modules: /etc/puppet/modules
    timeout: 7200

- id: top-role-ceph-osd
  type: puppet
  groups: [ceph-osd]
  required_for: [deploy_end]
  requires: [hosts, firewall]
  parameters:
    puppet_manifest: /etc/puppet/modules/osnailyfacter/modular/roles/ceph-osd.pp
    puppet_modules: /etc/puppet/modules
    timeout: 7200

Then you can deploy the environment with the increased timeouts.
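If the file has to be changed on several master nodes, the manual edit could be scripted. The following is only a sketch, assuming each task in tasks.yaml starts with a `- id:` line and has a numeric `timeout:` line somewhere below it. The function name `set_task_timeout`, the starting value 3600 and the task `some-other-task` are hypothetical and only serve the example. It works on plain text rather than a YAML parser, so comments and layout in the file are left untouched.

```python
import re


def set_task_timeout(yaml_text, task_ids, timeout):
    """Set 'timeout:' to a new value inside the tasks whose id is in task_ids."""
    out = []
    in_target = False
    for line in yaml_text.splitlines():
        start = re.match(r"^\s*-\s+id:\s*(\S+)\s*$", line)
        if start:
            # A new task begins; remember whether it is one we want to change.
            in_target = start.group(1) in task_ids
        elif in_target and re.match(r"^\s+timeout:\s*\d+\s*$", line):
            # Keep the original indentation, replace only the value.
            indent = line[: len(line) - len(line.lstrip())]
            line = f"{indent}timeout: {timeout}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical excerpt; the starting timeout of 3600 is an assumed value.
SAMPLE = """\
- id: top-role-compute
  type: puppet
  groups: [compute]
  parameters:
    puppet_manifest: /etc/puppet/modules/osnailyfacter/modular/roles/compute.pp
    timeout: 3600
- id: top-role-ceph-osd
  type: puppet
  groups: [ceph-osd]
  parameters:
    puppet_manifest: /etc/puppet/modules/osnailyfacter/modular/roles/ceph-osd.pp
    timeout: 3600
- id: some-other-task
  type: puppet
  parameters:
    timeout: 3600
"""

UPDATED = set_task_timeout(SAMPLE, {"top-role-compute", "top-role-ceph-osd"}, 7200)
print(UPDATED)
```

To apply it for real, read /etc/puppet/modules/osnailyfacter/modular/roles/tasks.yaml, pass its content through the function and write the result back, after making a backup copy of the original file.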

Stefan Herrnleben (stefan-herrnleben) said :
#2

Hello Denis,

sorry for the long response time. We had some hardware problems, so I was not able to try your suggestion over the last few weeks. I had to increase another timeout, and now our deployment works. Thanks for your response!

Best regards

Stefan