Trove instance went to ERROR because of "chmod: cannot access '/etc/guest_info': No such file or directory"

Asked by Lili

When I tried to launch a Trove database instance with the image built by DIB, the Trove instance went to ERROR, while its associated Nova instance went to ACTIVE. So I logged in to that VM and found the following information in the log files.

In /var/log/upstart/trove-guest.log, the error is as follows.
chmod: cannot access '/etc/guest_info': No such file or directory

I also looked at /var/log/boot.log and found the following, which really confused me. It seems that Trove Guest started successfully and then failed.
 * Starting Trove Guest^[[74G[ OK ]
Cloud-init v. 0.7.5 running 'modules:config' at Tue, 27 Oct 2015 08:47:33 +0000. Up 151.77 seconds.
 * Starting Trove Guest^[[74G[^[[31mfail^[[39;49m]
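The `^[[74G` and `^[[31m` fragments in the boot.log excerpt above are terminal control sequences (ESC displayed as `^[`), not part of the messages themselves. A minimal sketch, assuming a POSIX shell and sed are available on the guest, for stripping them so the status lines are easier to compare; the function name strip_ansi is only illustrative:

```shell
# strip_ansi: remove ANSI CSI escape sequences (ESC [ params letter) from stdin.
strip_ansi() {
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*[A-Za-z]//g"
}

# Example usage on the guest (log path taken from the post above):
# strip_ansi < /var/log/boot.log | grep 'Trove Guest'
```

This only cleans up the display; it does not change what the log says about the two start attempts.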

And on the guest VM, the Trove source code and database server package (here I used Vertica) can be found under ~/trove and /opt/vertica, respectively.

The following are some details on how OpenStack and Trove were installed, and how the Trove guest database image was built.

OpenStack environment: initially installed following the OpenStack Installation Guide for Ubuntu (Juno), and then upgraded to Kilo. After the upgrade, a Nova instance launched successfully, and "nova-manage --version" reported "2015.1.1", so I concluded that the upgrade was successful. The following services are running on OpenStack: Keystone, Nova, Glance, Cinder, and Horizon.

Trove: after upgrading OpenStack from Juno to Kilo, I installed Trove following the chapter "Add Database Service" in http://docs.openstack.org/icehouse/install-guide/install/apt/openstack-install-guide-apt-icehouse.pdf (OpenStack Installation Guide for Ubuntu - Icehouse), with updated configuration options found in Kilo. After installation, "trove list" produced the expected output (a table with columns: ID, Name, Datastore, Datastore Version, Status, Flavor ID, Size), and "trove-manage --version" reported "2015.1.0", so I concluded that Trove was installed successfully.

(I accidentally installed Trove as the root user. I cannot judge whether that has any influence on the issue I encountered, and would appreciate any comment on this.)

Trove guest database image creation: used DIB tool; the details can be found below.

1. Install prerequisites:
# sudo apt-get install qemu-utils kpartx

2. Install diskimage-builder:
# sudo pip install diskimage-builder

3. Install trove-integration and tripleo-image-elements:
$ git clone http://git.openstack.org/openstack/trove-integration
$ git clone https://git.openstack.org/openstack/tripleo-image-elements.git

4. Copy trove-integration/scripts/files/keys/ to ~/.ssh and change the file permissions, so that the Trove guest instance can be accessed via ssh.

5. Set the following environment variables to local values.

export HOST_USERNAME
export HOST_SCP_USERNAME
export GUEST_USERNAME
export NETWORK_GATEWAY
export REDSTACK_SCRIPTS
export SERVICE_TYPE
export PATH_TROVE
export ESCAPED_PATH_TROVE
export SSH_DIR
export GUEST_LOGDIR
export ESCAPED_GUEST_LOGDIR
export ELEMENTS_PATH=$REDSTACK_SCRIPTS/files/elements:$PATH_TRIPLEO_ELEMENTS/elements
export DIB_CLOUD_INIT_DATASOURCES="ConfigDrive"
export DATASTORE_PKG_LOCATION=""

6. Create the image:
$ disk-image-create -a amd64 -o ~/images/ubuntu_vertica/ubuntu_vertica -x --qemu-img-options compat=0.10 ubuntu apt-sources vm heat-cfntools cloud-init-datasources ubuntu-guest ubuntu-vertica

7. Register the image with Glance and update the datastore version.

Please feel free to let me know if you need any additional information. Thank you for your time and help in advance.

Question information

Language: English
Status: Solved
For: OpenStack DBaaS (Trove)
Assignee: No assignee

Launchpad Janitor (janitor) said :
#1

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Craig Vyvial (cp16net) said :
#2

So the /etc/guest_info file should be injected when the Trove instance is created. Could there have been an error in the trove-taskmanager logs that would help determine why it was not injected? Those logs should be located in /var/log/trove/. It sounds like you built and added the image correctly. One minor thing, which I think you understand from your message, is that there are different elements for each datastore: you mentioned using Vertica above and described building MongoDB. The first thing the guest agent does is read /etc/guest_info and /etc/trove/trove-guestagent.conf, so this would be before the instance runs.

Another thing I would look at is the cloud-init run on the instance. I think this may be the culprit, because first the guest agent reports that it started OK, then cloud-init runs, and then the guest agent starts again and fails.
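A quick way to confirm on the guest whether the two files the guest agent reads are actually present is sketched below. It assumes a POSIX shell; the helper name report_missing is hypothetical and not part of Trove.

```shell
# report_missing: print each given path that does not exist as a regular file.
# Returns non-zero if at least one path is missing.
report_missing() {
    status=0
    for path in "$@"; do
        if [ ! -f "$path" ]; then
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

# Example usage on the guest (both paths are the ones named in this reply):
# report_missing /etc/guest_info /etc/trove/trove-guestagent.conf
```

If /etc/guest_info is reported missing while the image itself looks fine, that points at the injection step rather than at the image build.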

Lili (lzhang0409) said :
#3

Dear Craig,

Thank you very much for carefully reading my question and providing your thoughts.

Yes. Your clue is right. I solved this problem about a week ago and forgot to update this post in time.

To determine whether trove-taskmanager generated the guest_info file correctly, I manually added a DEBUG log statement to the source code to output the content of the guest_info file. Its content turned out to be correct, so the problem had to lie in the next step of the process, where Nova injects that file while the instance boots. I then tried to boot a Nova instance by executing "nova boot" with the "--file" option to inject a file, and that failed. So it was a Nova problem. The following steps helped me fix it.

1. Install the following packages on the compute nodes, add your user to the kvm group, and make the kernel images readable

# apt-get install libguestfs-tools python-libguestfs
# sudo usermod -a -G kvm yourlogin
# chmod 0644 /boot/vmlinuz*

2. Add the following configuration options to /etc/nova/nova.conf on the compute nodes.

[DEFAULT]
compute_driver=libvirt.LibvirtDriver
rootwrap_config = /etc/nova/rootwrap.conf

[libvirt]
virt_type=qemu
inject_partition = -1

3. Restart nova-compute service on compute nodes

# service nova-compute restart
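To double-check that the options from step 2 are in the file that nova-compute reads, something like the following could be used. This is a minimal sketch assuming awk is available and that nova.conf uses plain "[section]" headers and "key = value" lines; get_ini_option is a hypothetical helper, not a Nova tool.

```shell
# get_ini_option FILE SECTION KEY: print the value of KEY inside [SECTION].
get_ini_option() {
    awk -v section="$2" -v key="$3" '
        /^\[/ {
            header = "[" section "]"
            in_section = ($0 == header)
            next
        }
        in_section && index($0, "=") > 0 {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) print v
        }
    ' "$1"
}

# Example usage on a compute node:
# get_ini_option /etc/nova/nova.conf libvirt inject_partition
#
# File injection can then be retried with the same kind of test described
# above; <flavor>, <image> and the file names are placeholders:
# nova boot --flavor <flavor> --image <image> \
#     --file /tmp/inject_test=./inject_test.txt inject-test
```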

Thank you again for your time.

Regards,
Lili

Craig Vyvial (cp16net) said :
#4

Awesome! Glad you have it figured out with some great sleuthing skills.

Craig Vyvial (cp16net) said :
#5

Issue was solved with a configuration change and adding packages to the compute nodes for nova.

p.cipriano (ctpeter) said :
#6

May I know what I should input here?

export REDSTACK_SCRIPTS
export SERVICE_TYPE
export PATH_TROVE

jian.song (jiansong) said :
#7

Maybe I can help with some of these:

REDSTACK_SCRIPTS
This name is out of date; the current one is TROVESTACK_SCRIPTS. Judging from the code, this parameter should point to the location of trove-integration. Before the N release, trove-integration was a stand-alone project. After that, it was merged into trove, and it now lives in trove/integration.

SERVICE_TYPE
Judging from the code, SERVICE_TYPE refers to the actual type of database. The specific values can be seen here:
https://github.com/openstack/trove/blob/master/integration/scripts/functions_qemu#L86

PATH_TROVE
Should refer to the path where the Trove code is stored.

If the network environment is good enough, it is recommended to use trovestack to build the image, which can reduce some of these issues.
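As a rough illustration of the answer above, the variables might be set as follows. All values are hypothetical and must be adapted to the local checkout and datastore. ESCAPED_PATH_TROVE is assumed here to be PATH_TROVE with its slashes backslash-escaped for use in sed expressions, and escape_slashes is an illustrative helper, not part of Trove.

```shell
# escape_slashes: backslash-escape every "/" so the path can be embedded
# in a sed "s/.../.../" expression.
escape_slashes() {
    printf '%s\n' "$1" | sed 's/\//\\\//g'
}

# Hypothetical values; adjust to the local environment.
TROVESTACK_SCRIPTS="${HOME}/trove/integration/scripts"  # assumed checkout location
SERVICE_TYPE="mysql"                                    # assumed datastore type
PATH_TROVE="${HOME}/trove"                              # assumed Trove source path
ESCAPED_PATH_TROVE=$(escape_slashes "$PATH_TROVE")
export TROVESTACK_SCRIPTS SERVICE_TYPE PATH_TROVE ESCAPED_PATH_TROVE
```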