Instance gets more than one fixed IP (grizzly-g3)

Asked by shu, xinxin

I have installed grizzly-g3, but quantum does not work well. When I boot 128 instances, one of the instances gets more than one fixed IP; however, when I boot 64 instances, it never happens. Besides that, sometimes I cannot ping a VM by its floating IP. I did not find any error message in my quantum logs (all the files in /var/log/quantum). Below are the error output and my configurations.

| 97a93600-38e2-4700-9851-15ef56c1d628 | slave | ACTIVE | demo-int-net=172.16.100.4 |
| 99aeb6b8-4252-4839-a7d1-f87853116100 | slave | ACTIVE | demo-int-net=172.16.100.117 |
| 9aa82a35-c9f1-4f44-a108-d14e74eec231 | slave | ACTIVE | demo-int-net=172.16.100.108, 172.16.100.109 |
| 9b6b1289-c450-4614-b647-e5ebdffff80a | slave | ACTIVE | demo-int-net=172.16.100.5 |
| 9e0d3aa5-0f15-4b24-944a-6d6c3e18ce64 | slave | ACTIVE | demo-int-net=172.16.100.35 |
| 9ea62124-9128-43cc-acdd-142f1e7743d6 | slave | ACTIVE | demo-int-net=172.16.100.132 |

My setup: one db host (db service), one glance host (glance service), one api host (keystone, nova-api, nova-scheduler, nova-conductor, quantum-server, quantum-dhcp, quantum-l3-agent, quantum-plugin-openvswitch-agent), and eight compute hosts (each with nova-compute and quantum-plugin-openvswitch-agent). I checked that all services on all hosts are running.

I used a VLAN-type network with the Open vSwitch plugin:

My quantum.conf:

[DEFAULT]
# Default log level is INFO
# verbose and debug has the same result.
# One of them will set DEBUG log level output
debug = True

# Address to bind the API server
bind_host = 0.0.0.0

# Port the bind the API server to
bind_port = 9696

# Quantum plugin provider module
# core_plugin =
core_plugin = quantum.plugins.openvswitch.ovs_quantum_plugin.OVSQuantumPluginV2
# Advanced service modules
# service_plugins =

# Paste configuration file
api_paste_config = /etc/quantum/api-paste.ini

# The strategy to be used for auth.
# Supported values are 'keystone'(default), 'noauth'.
auth_strategy = keystone

# Modules of exceptions that are permitted to be recreated
# upon receiving exception data from an rpc call.
# allowed_rpc_exception_modules = quantum.openstack.common.exception, nova.exception
# AMQP exchange to connect to if using RabbitMQ or QPID
control_exchange = quantum

# RPC driver. DHCP agents needs it.
notification_driver = quantum.openstack.common.notifier.rpc_notifier

# default_notification_level is used to form actual topic name(s) or to set logging level
default_notification_level = INFO

# Defined in rpc_notifier, can be comma separated values.
# The actual topic names will be %s.%(default_notification_level)s
notification_topics = notifications

[QUOTAS]
# resource name(s) that are supported in quota features
# quota_items = network,subnet,port

# default number of resource allowed per tenant, minus for unlimited
# default_quota = -1

# number of networks allowed per tenant, and minus means unlimited
# quota_network = 10

# number of subnets allowed per tenant, and minus means unlimited
# quota_subnet = 10

# number of ports allowed per tenant, and minus means unlimited
quota_port = 5000
quota_floatingip = 5000

# default driver to use for quota checks
# quota_driver = quantum.quota.ConfDriver

# =========== items for agent management extension =============
# Seconds to regard the agent as down.
# agent_down_time = 5
# =========== end of items for agent management extension =====

[DEFAULT_SERVICETYPE]
# Description of the default service type (optional)
# description = "default service type"
# Enter a service definition line for each advanced service provided
# by the default service type.
# Each service definition should be in the following format:
# <service>:<plugin>[:driver]

[SECURITYGROUP]
# If set to true this allows quantum to receive proxied security group calls from nova
# proxy_mode = False

[AGENT]
root_helper = sudo quantum-rootwrap /etc/quantum/rootwrap.conf

# =========== items for agent management extension =============
# seconds between nodes reporting state to server, should be less than
# agent_down_time
# report_interval = 4

# =========== end of items for agent management extension =====

[keystone_authtoken]
auth_host = host-keystone
auth_port = 35357
auth_protocol = http
admin_tenant_name = demoTenant
admin_user = test
admin_password = 123456
signing_dir = /var/lib/quantum/keystone-signing

My dhcp_agent.ini:

[DEFAULT]
# Where to store dnsmasq state files. This directory must be writable by the
# user executing the agent.
state_path = /var/lib/quantum

# OVS based plugins(OVS, Ryu, NEC, NVP, BigSwitch/Floodlight)
interface_driver = quantum.agent.linux.interface.OVSInterfaceDriver

# The agent can use other DHCP drivers. Dnsmasq is the simplest and requires
# no additional setup of the DHCP server.
dhcp_driver = quantum.agent.linux.dhcp.Dnsmasq

My ovs_quantum_plugin.ini configuration file:

[DATABASE]
# This line MUST be changed to actually run the plugin.
sql_connection = mysql://quantum:quantum@host-db/quantum

# Database reconnection interval in seconds - if the initial connection to the
# database fails
reconnect_interval = 2

[OVS]
# (StrOpt) Type of network to allocate for tenant networks. The
# default value 'local' is useful only for single-box testing and
# provides no connectivity between hosts. You MUST either change this
# to 'vlan' and configure network_vlan_ranges below or change this to
# 'gre' and configure tunnel_id_ranges below in order for tenant
# networks to provide connectivity between hosts. Set to 'none' to
# disable creation of tenant networks.

tenant_network_type=vlan

network_vlan_ranges = DemoNet:1:4094

bridge_mappings = DemoNet:DemoBridge

[AGENT]
# Agent's polling interval in seconds
polling_interval = 2

[SECURITYGROUP]

When I execute "quantum router-gateway-set" to set the external network as the router gateway, I find that the status of the router's port on the external network is DOWN. Does that matter, and if it does, how can I fix it?
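
For reference, the state of the router's gateway port can also be inspected through the API. Below is only a rough sketch using the quantumclient Python bindings; the credentials, endpoint and router id are placeholders:

from quantumclient.v2_0 import client

# Placeholder credentials/endpoint; adjust for the actual deployment.
quantum = client.Client(username='test', password='123456',
                        tenant_name='demoTenant',
                        auth_url='http://host-keystone:5000/v2.0')

router_id = '<router-uuid>'  # placeholder
# The gateway port is owned by the router with
# device_owner 'network:router_gateway'; print each port's status.
for port in quantum.list_ports(device_id=router_id)['ports']:
    print('%s %s %s' % (port['device_owner'], port['id'], port['status']))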

This has blocked me for several days. Can someone help me solve it? Any help would be appreciated.

Question information

Language: English
Status: Answered
For: neutron
Assignee: No assignee
Revision history for this message
shu, xinxin (xinxin-shu) said :
#1

ovs-vsctl show on api host

fa0399c8-2abc-4161-8a7c-18b7404f1836
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
        Port "qg-28ba8355-4c"
            Interface "qg-28ba8355-4c"
                type: internal
        Port "eth2"
            Interface "eth2"
    Bridge br-int
        Port "qr-a69d9f83-ff"
            tag: 1
            Interface "qr-a69d9f83-ff"
                type: internal
        Port int-DemoBridge
            Interface int-DemoBridge
        Port "tap9d3ab4cf-59"
            tag: 1
            Interface "tap9d3ab4cf-59"
                type: internal
        Port br-int
            Interface br-int
                type: internal
    Bridge DemoBridge
        Port "eth1"
            Interface "eth1"
        Port phy-DemoBridge
            Interface phy-DemoBridge
        Port DemoBridge
            Interface DemoBridge
                type: internal
    ovs_version: "1.4.3"

ovs-vsctl show on compute host

Bridge br-int
        Port "tapc6ab6692-92"
            tag: 1
            Interface "tapc6ab6692-92"
        Port "tapdd3bbb21-1a"
            tag: 1
            Interface "tapdd3bbb21-1a"
  Bridge DemoBridge
        Port "bond0"
            Interface "bond0"
        Port phy-DemoBridge
            Interface phy-DemoBridge
        Port DemoBridge
            Interface DemoBridge
                type: internal
    ovs_version: "1.4.3"

Revision history for this message
dan wendlandt (danwent) said :
#2

The multiple IPs for a VM seem like a bug, assuming you made the same request for all VMs (which it sounds like you did).

Can you provide the output of quantum port-list (to confirm whether there is one port that has multiple IPs, or whether nova is somehow associating multiple ports with the instance)?

If there is a port with two IPs, can you try to reproduce this with verbose=True and debug=True, and post the log somewhere?

If not, then I'd be interested in seeing the nova-compute log for the host running the VM with multiple IPs. It's possible that something strange is happening on the nova side that results in multiple ports being associated with a single VM.
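
For what it's worth, one way to tell those two cases apart across all instances at once is to group ports by device_id. This is only a sketch using the quantumclient Python bindings, with placeholder credentials:

from collections import defaultdict
from quantumclient.v2_0 import client

# Sketch only; credentials/endpoint are placeholders.
quantum = client.Client(username='test', password='123456',
                        tenant_name='demoTenant',
                        auth_url='http://host-keystone:5000/v2.0')

ports_by_device = defaultdict(list)
for port in quantum.list_ports()['ports']:
    if len(port['fixed_ips']) > 1:
        # Case 1: a single port with several IPs (would point at quantum).
        print('port %s has %d fixed_ips' % (port['id'], len(port['fixed_ips'])))
    ports_by_device[port['device_id']].append(port['id'])

for device_id, port_ids in ports_by_device.items():
    if device_id and len(port_ids) > 1:
        # Case 2: several ports bound to one instance (would point at nova).
        print('device %s has ports %s' % (device_id, port_ids))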

Revision history for this message
shu, xinxin (xinxin-shu) said :
#3

I listed the ports of the instance which has two fixed IPs; below is the output:

+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| id | name | mac_address | fixed_ips |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| 1b93218d-2eb2-4064-b70c-d6562724fc47 | | fa:16:3e:10:7d:8c | {"subnet_id": "24be33c7-64f9-4695-811d-33eeb4ad75da", "ip_address": "172.16.100.108"} |
| fc44d25d-574e-4c09-a6fa-74cd522b0e3c | | fa:16:3e:66:e2:60 | {"subnet_id": "24be33c7-64f9-4695-811d-33eeb4ad75da", "ip_address": "172.16.100.109"} |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+

I listed all network interfaces in this VM and only found one interface, with fixed IP 172.16.100.108.

Here is the link to the nova-compute log: http://paste.openstack.org/show/34579/

The id of this instance is 9aa82a35-c9f1-4f44-a108-d14e74eec231.

Revision history for this message
dan wendlandt (danwent) said :
#4

To me, this output says that 108 and 109 are actually on different ports.

Can you do a quantum port-show <port-uuid> for both of the ports below? In particular, I want to confirm that they have the same device_id.

From the looks of it, this appears to be an issue in Nova, as quantum itself has not actually allocated multiple IPs to a single port. Rather, pending the output I've requested above, it seems like nova created two ports for an instance rather than one.

Looking through your logs, there is an extremely high number of exceptions and tracebacks in your nova log. I wonder if any of those failures caused an instance to be respawned without the quantum port from the initial creation being cleaned up.

Unfortunately, the nova-compute log seems filled with tracebacks unrelated to quantum, so it's hard to isolate a possible cause of the issue you're seeing above. It seems like you have some database schema sync issues at the very least. I'd suggest cleaning up these issues and seeing if you can still repro this.

Dan

Revision history for this message
dan wendlandt (danwent) said :
#5

During the portion of the log where you are deleting VMs, I see two quantum exceptions, but they are "port not found" errors that seem correlated with nova "instance not found" errors, and they are not for the port-ids mentioned above, so I doubt this is related.

2013-03-26 03:58:04.965 ERROR nova.network.quantumv2.api [req-b6fe4986-507d-4735-a504-adbf4386b5bc c9ae6c830b3b4d40925e5fe23f671893 ebacbc9c99f84607920b2ac749608623] Failed to delete quantum port 03347623-a4c8-473f-bde6-845d19c0009c
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api Traceback (most recent call last):
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api File "/usr/lib/python2.7/dist-packages/nova/network/quantumv2/api.py", line 309, in deallocate_for_instance
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api quantumv2.get_client(context).delete_port(port['id'])
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api File "/usr/lib/python2.7/dist-packages/quantumclient/v2_0/client.py", line 102, in with_params
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api ret = self.function(instance, *args, **kwargs)
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api File "/usr/lib/python2.7/dist-packages/quantumclient/v2_0/client.py", line 236, in delete_port
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api return self.delete(self.port_path % (port))
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api File "/usr/lib/python2.7/dist-packages/quantumclient/v2_0/client.py", line 521, in delete
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api headers=headers, params=params)
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api File "/usr/lib/python2.7/dist-packages/quantumclient/v2_0/client.py", line 510, in retry_request
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api headers=headers, params=params)
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api File "/usr/lib/python2.7/dist-packages/quantumclient/v2_0/client.py", line 455, in do_request
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api self._handle_fault_response(status_code, replybody)
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api File "/usr/lib/python2.7/dist-packages/quantumclient/v2_0/client.py", line 436, in _handle_fault_response
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api exception_handler_v20(status_code, des_error_body)
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api File "/usr/lib/python2.7/dist-packages/quantumclient/v2_0/client.py", line 75, in exception_handler_v20
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api raise exceptions.QuantumClientException(message=error_dict)
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api QuantumClientException: Port 03347623-a4c8-473f-bde6-845d19c0009c could not be found on network None
2013-03-26 03:58:04.965 1290 TRACE nova.network.quantumv2.api

Revision history for this message
shu, xinxin (xinxin-shu) said :
#6

I cleaned all nova-compute.log files on the compute nodes and reproduced this error; in this output there are two instances with multiple fixed IPs:

 fb356afb-49ce-4922-82c4-c8460cb8fc4c | slave | ACTIVE | demo-int-net=172.16.100.129 |
| fbf44f20-27b0-4c13-8630-8fafeecff06c | slave | ACTIVE | demo-int-net=172.16.100.24 |
| fd3f892e-c94a-407e-9dab-edb1c48adafc | slave | ACTIVE | demo-int-net=172.16.100.117, 172.16.100.115 |
| ff04d946-1851-4050-8d02-65f4fffb42de | slave | ACTIVE | demo-int-net=172.16.100.50 |
55321d5f-9af1-4736-9599-412742f792ae | slave | ACTIVE | demo-int-net=172.16.100.64 |
| 555c5198-8a44-4dee-98eb-ef704b324f8b | slave | ACTIVE | demo-int-net=172.16.100.111, 172.16.100.112 |
| 557dd69b-ff73-4d67-bf48-6f860ba3b236 | slave | ACTIVE | demo-int-net=172.16.100.63 |

The compute log of instance 555c5198-8a44-4dee-98eb-ef704b324f8b is http://paste.openstack.org/show/34592/

+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| id | name | mac_address | fixed_ips |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| 21833908-28f0-4ec7-9497-d19eba06e209 | | fa:16:3e:e0:b3:cc | {"subnet_id": "24be33c7-64f9-4695-811d-33eeb4ad75da", "ip_address": "172.16.100.111"} |
| 9637e6e5-673e-4443-9b0e-27f15ed3f92b | | fa:16:3e:bb:f0:f4 | {"subnet_id": "24be33c7-64f9-4695-811d-33eeb4ad75da", "ip_address": "172.16.100.112"} |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+

quan

Revision history for this message
shu, xinxin (xinxin-shu) said :
#7

Sorry about that, I clicked the wrong button.
I cleaned all nova-compute.log files on the compute nodes and reproduced this error; in this output there are two instances with multiple fixed IPs:

 fb356afb-49ce-4922-82c4-c8460cb8fc4c | slave | ACTIVE | demo-int-net=172.16.100.129 |
| fbf44f20-27b0-4c13-8630-8fafeecff06c | slave | ACTIVE | demo-int-net=172.16.100.24 |
| fd3f892e-c94a-407e-9dab-edb1c48adafc | slave | ACTIVE | demo-int-net=172.16.100.117, 172.16.100.115 |
| ff04d946-1851-4050-8d02-65f4fffb42de | slave | ACTIVE | demo-int-net=172.16.100.50 |
55321d5f-9af1-4736-9599-412742f792ae | slave | ACTIVE | demo-int-net=172.16.100.64 |
| 555c5198-8a44-4dee-98eb-ef704b324f8b | slave | ACTIVE | demo-int-net=172.16.100.111, 172.16.100.112 |
| 557dd69b-ff73-4d67-bf48-6f860ba3b236 | slave | ACTIVE | demo-int-net=172.16.100.63 |

The compute log of instance 555c5198-8a44-4dee-98eb-ef704b324f8b is http://paste.openstack.org/show/34592/

+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| id | name | mac_address | fixed_ips |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| 21833908-28f0-4ec7-9497-d19eba06e209 | | fa:16:3e:e0:b3:cc | {"subnet_id": "24be33c7-64f9-4695-811d-33eeb4ad75da", "ip_address": "172.16.100.111"} |
| 9637e6e5-673e-4443-9b0e-27f15ed3f92b | | fa:16:3e:bb:f0:f4 | {"subnet_id": "24be33c7-64f9-4695-811d-33eeb4ad75da", "ip_address": "172.16.100.112"} |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+

quantum port-show

root@kapi:/var/log/quantum# quantum port-show 21833908-28f0-4ec7-9497-d19eba06e209
+----------------------+---------------------------------------------------------------------------------------+
| Field | Value |
+----------------------+---------------------------------------------------------------------------------------+
| admin_state_up | True |
| binding:capabilities | {"port_filter": true} |
| binding:vif_type | ovs |
| device_id | 555c5198-8a44-4dee-98eb-ef704b324f8b |
| device_owner | compute:None |
| fixed_ips | {"subnet_id": "24be33c7-64f9-4695-811d-33eeb4ad75da", "ip_address": "172.16.100.111"} |
| id | 21833908-28f0-4ec7-9497-d19eba06e209 |
| mac_address | fa:16:3e:e0:b3:cc |
| name | |
| network_id | 8b933d20-f53b-40e0-a96d-67f3befb6449 |
| security_groups | 065a1cba-03f8-4172-bcee-d635155b965a |
| status | ACTIVE |
| tenant_id | ebacbc9c99f84607920b2ac749608623 |
+----------------------+---------------------------------------------------------------------------------------+

+----------------------+---------------------------------------------------------------------------------------+
| Field | Value |
+----------------------+---------------------------------------------------------------------------------------+
| admin_state_up | True |
| binding:capabilities | {"port_filter": true} |
| binding:vif_type | ovs |
| device_id | 555c5198-8a44-4dee-98eb-ef704b324f8b |
| device_owner | compute:None |
| fixed_ips | {"subnet_id": "24be33c7-64f9-4695-811d-33eeb4ad75da", "ip_address": "172.16.100.112"} |
| id | 9637e6e5-673e-4443-9b0e-27f15ed3f92b |
| mac_address | fa:16:3e:bb:f0:f4 |
| name | |
| network_id | 8b933d20-f53b-40e0-a96d-67f3befb6449 |
| security_groups | 065a1cba-03f8-4172-bcee-d635155b965a |
| status | ACTIVE |
| tenant_id | ebacbc9c99f84607920b2ac749608623 |
+----------------------+---------------------------------------------------------------------------------------+

From the above output, the two ports have the same device_id. The other instance is the same: two ports with the same device_id.

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#8

For what it's worth, we have seen this issue a number of times in our tests too, and I believe it was happening even prior to grizzly-3. It seems to happen when a bunch of instances are created consecutively (as in a script).

Revision history for this message
Salvatore Orlando (salvatore-orlando) said :
#9

Hi Xinxin,

I do apologise, but I need to ask you for more information in order to nail down this issue.

From your tests, does the total number of ports created equal the number of VMs? In other words, do you see any VM without ports, in addition to VMs with two ports?

I am also wondering whether, when Quantum creates two ports, nova also creates two VIFs. From the look of the port-show output, as both ports are ACTIVE, this seems to be the case, but it's probably worth checking the generated libvirt XML file (or logging into the instance and checking its network interfaces).

Also, in order to understand which code paths are activated, how do you specify networks at boot? Do you use --nic network-id, --nic port-id, or do you not specify the --nic option at all?
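
One way to answer the VIF question without logging into every guest is to count the <interface> elements in each domain's XML on the compute host. A rough sketch with the libvirt Python bindings (the connection URI is the usual local default and may differ in your environment):

import libvirt
from xml.etree import ElementTree

conn = libvirt.open('qemu:///system')  # run on the compute host
for dom_id in conn.listDomainsID():
    dom = conn.lookupByID(dom_id)
    devices = ElementTree.fromstring(dom.XMLDesc(0)).find('devices')
    nics = devices.findall('interface') if devices is not None else []
    if len(nics) > 1:
        print('%s (%s) has %d interfaces'
              % (dom.name(), dom.UUIDString(), len(nics)))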

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#10

I will let Xinxin report his observations; in our case I did see two network interfaces and tap devices being created:

<interface type="bridge">
  <mac address="fa:16:3e:a8:fe:96"/>
  <model type="virtio"/>
  <driver name="qemu"/>
  <source bridge="qbr7aac0341-19"/>
  <target dev="tap7aac0341-19"/>
  <filterref filter="nova-instance-instance-00000004-fa163ea8fe96">
    <parameter name="IP" value="10.194.193.4"/>
    <parameter name="DHCPSERVER" value="10.194.193.2"/>
    <parameter name="PROJNET" value="10.194.193.0"/>
    <parameter name="PROJMASK" value="255.255.255.0"/>
  </filterref>
</interface>
<interface type="bridge">
  <mac address="fa:16:3e:9a:b9:33"/>
  <model type="virtio"/>
  <driver name="qemu"/>
  <source bridge="qbr7ddce382-c7"/>
  <target dev="tap7ddce382-c7"/>
  <filterref filter="nova-instance-instance-00000004-fa163e9ab933">
    <parameter name="IP" value="10.194.193.5"/>
    <parameter name="DHCPSERVER" value="10.194.193.2"/>
    <parameter name="PROJNET" value="10.194.193.0"/>
    <parameter name="PROJMASK" value="255.255.255.0"/>
  </filterref>
</interface>

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#11

And, we were using the --nic network-id option.

Revision history for this message
shu, xinxin (xinxin-shu) said :
#12

Sorry for my late response. I used --nic network-id, not --nic port-id. I can see that every VM has at least one port. I reproduced this error (this time also capturing the quantum server log) after cleaning all nova-compute.log files on the compute hosts and the quantum logs on the api host. This time there are four instances that have two fixed IPs; each instance has two ports with the same device_id, and in the libvirt XML there are two tap interfaces for these instances. If I log into such a VM and execute "ifconfig -a", it shows two NICs (one is up with a fixed IP, the other is down). I will take one instance as an example; below is the detailed information.

Output of nova list:

| 16599b42-190e-473f-8586-d40746484ce7 | slave | ACTIVE | demo-int-net=172.16.100.21 |
| 18298dc0-3042-42ee-80c6-08483a582712 | slave | ACTIVE | demo-int-net=172.16.100.117, 172.16.100.123 |
| 19bbe682-45df-4ebf-972e-617cfab76c82 | slave | ACTIVE | demo-int-net=172.16.100.17

Port list of this instance:

+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| id | name | mac_address | fixed_ips |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| 222d558f-bb03-4295-919b-9081150ada0c | | fa:16:3e:1d:d1:60 | {"subnet_id": "3a8f62c8-a5ce-4da3-b5a2-1ca612496fc1", "ip_address": "172.16.100.117"} |
| c0002dbb-50bf-48dc-8755-ba6b8a011452 | | fa:16:3e:a1:95:46 | {"subnet_id": "3a8f62c8-a5ce-4da3-b5a2-1ca612496fc1", "ip_address": "172.16.100.123"} |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+

Status of each port:

+----------------------+---------------------------------------------------------------------------------------+
| Field | Value |
+----------------------+---------------------------------------------------------------------------------------+
| admin_state_up | True |
| binding:capabilities | {"port_filter": true} |
| binding:vif_type | ovs |
| device_id | 18298dc0-3042-42ee-80c6-08483a582712 |
| device_owner | compute:None |
| fixed_ips | {"subnet_id": "3a8f62c8-a5ce-4da3-b5a2-1ca612496fc1", "ip_address": "172.16.100.117"} |
| id | 222d558f-bb03-4295-919b-9081150ada0c |
| mac_address | fa:16:3e:1d:d1:60 |
| name | |
| network_id | 4a798345-561f-4c9e-a5c4-aac87f1daf26 |
| security_groups | 49a820ec-cf06-4316-bf4b-c1f20902848f |
| status | ACTIVE |
| tenant_id | ebacbc9c99f84607920b2ac749608623 |
+----------------------+---------------------------------------------------------------------------------------+

+----------------------+---------------------------------------------------------------------------------------+
| Field | Value |
+----------------------+---------------------------------------------------------------------------------------+
| admin_state_up | True |
| binding:capabilities | {"port_filter": true} |
| binding:vif_type | ovs |
| device_id | 18298dc0-3042-42ee-80c6-08483a582712 |
| device_owner | compute:None |
| fixed_ips | {"subnet_id": "3a8f62c8-a5ce-4da3-b5a2-1ca612496fc1", "ip_address": "172.16.100.123"} |
| id | c0002dbb-50bf-48dc-8755-ba6b8a011452 |
| mac_address | fa:16:3e:a1:95:46 |
| name | |
| network_id | 4a798345-561f-4c9e-a5c4-aac87f1daf26 |
| security_groups | 49a820ec-cf06-4316-bf4b-c1f20902848f |
| status | ACTIVE |
| tenant_id | ebacbc9c99f84607920b2ac749608623 |
+----------------------+---------------------------------------------------------------------------------------+

libvirt XML (two tap interfaces):

<!--
WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
OVERWRITTEN AND LOST. Changes to this xml configuration should be made using:
  virsh edit instance-0000008f
or other application using the libvirt API.
-->

<domain type='kvm'>
  <name>instance-0000008f</name>
  <uuid>18298dc0-3042-42ee-80c6-08483a582712</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>OpenStack Foundation</entry>
      <entry name='product'>OpenStack Nova</entry>
      <entry name='version'>2013.1</entry>
      <entry name='serial'>db7cce3d-4e73-11e0-bf2c-001e67059c8c</entry>
      <entry name='uuid'>18298dc0-3042-42ee-80c6-08483a582712</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-1.2'>hvm</type>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/nova/instances/18298dc0-3042-42ee-80c6-08483a582712/disk'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='fa:16:3e:1d:d1:60'/>
      <source bridge='br-int'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='222d558f-bb03-4295-919b-9081150ada0c'/>
      </virtualport>
      <target dev='tap222d558f-bb'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='fa:16:3e:a1:95:46'/>
      <source bridge='br-int'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='c0002dbb-50bf-48dc-8755-ba6b8a011452'/>
      </virtualport>
      <target dev='tapc0002dbb-50'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
    <serial type='file'>
      <source path='/var/lib/nova/instances/18298dc0-3042-42ee-80c6-08483a582712/console.log'/>
      <target port='0'/>
    </serial>
    <serial type='pty'>
      <target port='1'/>
    </serial>
    <console type='file'>
      <source path='/var/lib/nova/instances/18298dc0-3042-42ee-80c6-08483a582712/console.log'/>
      <target type='serial' port='0'/>
    </console>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>

run "ovs-vsctl list-ports br-int" on compute host where this instance boot, found these two taps on ports list.

nova-compute.log on this compute host : http://paste.openstack.org/show/34699/

Revision history for this message
Aaron Rosen (arosen) said :
#13

Hi,

I've looked into this and tried to reproduce it, but have been unsuccessful. I don't see anything off-hand in the code either :/ . One thing I tried was running two instances of nova-compute on the same host, which actually does work. While doing this I ran the following script several times (see if a VM gets two IPs from this network... delete all VMs... run again), but didn't see the issue.

$ cat boot.sh
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 --nic net-id=effd0a91-923e-4345-a368-5235b3238bf6 asdf &
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 --nic net-id=effd0a91-923e-4345-a368-5235b3238bf6 asdf &
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 --nic net-id=effd0a91-923e-4345-a368-5235b3238bf6 asdf &
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 --nic net-id=effd0a91-923e-4345-a368-5235b3238bf6 asdf &
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 --nic net-id=effd0a91-923e-4345-a368-5235b3238bf6 asdf &
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 --nic net-id=effd0a91-923e-4345-a368-5235b3238bf6 asdf &
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 --nic net-id=effd0a91-923e-4345-a368-5235b3238bf6 asdf &
nova boot --image cirros-0.3.1-x86_64-uec --flavor 1 --nic net-id=effd0a91-923e-4345-a368-5235b3238bf6 asdf &

Just to rule it out, is there any chance that you might have two instances of nova-compute running on the same host? Could you provide the following output?

ps -eaf | grep nova

and, just for the heck of it, the output of the following, so we can test with the same type of network (shared/provider etc.) in the rare chance this matters:
quantum net-show 8b933d20-f53b-40e0-a96d-67f3befb6449
quantum subnet-show 24be33c7-64f9-4695-811d-33eeb4ad75da

Thanks

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#15

We have seen this behavior about 20% of the time; it is not deterministic in our setup. We were running one instance of nova-compute on the host.

Revision history for this message
dan wendlandt (danwent) said :
#16

Sumit, if you are able to see this predictably, can you insert some additional logging calls to understand the flow that leads to these duplicate ports? For example, are both ports created within the same call to allocate_for_instance (e.g., does the nets array somehow contain the same network twice)? Or is this the result of multiple calls to allocate_for_instance? Logging a message with the instance-id when we enter allocate_for_instance, and then a message indicating the quantum port ids allocated for that instance-id, would be valuable (I would argue the latter should be something we actually push to nova for inclusion, as it seems appropriate at the INFO or DEBUG level).
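
To make the suggested instrumentation concrete, here is a minimal sketch. It is not the actual nova code; the function and variable names are only placeholders for the grizzly-era allocate_for_instance in nova/network/quantumv2/api.py:

import logging

LOG = logging.getLogger(__name__)

def allocate_for_instance(context, instance, requested_networks):
    """Sketch only: where the two suggested log lines would go."""
    LOG.info('allocate_for_instance: instance %s, requested networks %s',
             instance['uuid'], requested_networks)
    created_port_ids = []
    # ... the existing per-network port-creation loop would run here ...
    LOG.info('allocate_for_instance: instance %s got quantum ports %s',
             instance['uuid'], created_port_ids)
    return created_port_ids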

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#17

Yes, I am waiting for this to show up again. We see this in the automated infrastructure. Unfortunately in our setup it's unpredictable, but has shown up fairly often in the past.

Revision history for this message
shu, xinxin (xinxin-shu) said :
#18

Hi Aaron, output of "ps -aef | grep nova" (eight compute hosts):

nova 13998 1 0 02:04 ? 00:05:16 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
nova 12501 1 0 02:04 ? 00:04:59 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
nova 24958 1 0 02:04 ? 00:05:54 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
nova 22490 1 0 02:04 ? 00:05:20 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
nova 11333 1 0 02:04 ? 00:04:54 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
nova 8137 1 0 02:04 ? 00:05:00 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
nova 17029 1 0 02:04 ? 00:05:19 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf
nova 12178 1 0 02:04 ? 00:05:50 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf

My internal network (quantum net-show):

+---------------------------+--------------------------------------+
| Field | Value |
+---------------------------+--------------------------------------+
| admin_state_up | True |
| id | 36a78a32-9488-410d-86d2-22578767d6c9 |
| name | demo-int-net |
| provider:network_type | vlan |
| provider:physical_network | DemoNet |
| provider:segmentation_id | 100 |
| router:external | False |
| shared | False |
| status | ACTIVE |
| subnets | ff4ebc46-5e94-4884-90da-dd66e9b3572b |
| tenant_id | ebacbc9c99f84607920b2ac749608623 |
+---------------------------+--------------------------------------+

My external network:

+---------------------------+--------------------------------------+
| Field | Value |
+---------------------------+--------------------------------------+
| admin_state_up | True |
| id | 5613fb31-8c0d-430e-8099-0a67b9a65c52 |
| name | demo-ext-net |
| provider:network_type | vlan |
| provider:physical_network | DemoNet |
| provider:segmentation_id | 1 |
| router:external | True |
| shared | False |
| status | ACTIVE |
| subnets | 57350a07-2329-4cae-9423-607a1cb18ed3 |
| tenant_id | ebacbc9c99f84607920b2ac749608623 |
+---------------------------+--------------------------------------+

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#19

Unfortunately we have not been able to reproduce this in any of the runs since yesterday. Will keep monitoring.

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#20

This happened again today; however, the automation run got wiped out after the test failure (on account of the dual IPs). We have changed the automation script so that this does not happen next time. We will investigate once we reproduce this again and have a live system. Sorry I am not providing much additional information, but I just wanted to report back.

Revision history for this message
dan wendlandt (danwent) said :
#21

Thanks for the update Sumit.

Not sure if you saw my comments above, but it might be good to insert some logging statements into the nova code that talks to quantum; otherwise I'm not sure we'll have much more information to go on than what has already been posted in the question. The quantum logs would be useful as well.

Dan

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#22

Here is the nova log as we reproduced this: http://paste.openstack.org/show/35783/

Sorry Dan, I could not add any additional log statements in this run, but I am looking at the setup.

Revision history for this message
dan wendlandt (danwent) said :
#23

OK, could you possibly post the relevant portions of the nova-compute and quantum logs, along with a port-show for each of the duplicate ports (or some other way for us to see the UUIDs of the duplicate ports and the instance id of the VM)? Thanks.

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#24
Revision history for this message
Aaron Rosen (arosen) said :
#25

Interesting, your nova-compute log shows there was a timeout. Vish was thinking that perhaps the port had already been created, a timeout caused the VM to be rescheduled on another hypervisor, and another port was then created.

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#27

Yes, there is an exception due to a timeout, and the earlier VM creation failed. But this is the log from the same node, so I am not sure how scheduling on another hypervisor plays into this.

To me, the basic question is why nova is creating two interfaces for this VM (see the libvirt configuration for this VM). It is not the case that nova creates one interface and somehow ends up with two ports on the Quantum side; nova is actually creating two interfaces and requesting two ports for them.

Revision history for this message
dan wendlandt (danwent) said :
#28

Sumit, this is because the order is actually the other way around. Think of it this way: nova contacts quantum to create ports, and then, when the nova virt layer is actually creating the VM, it looks at how many ports exist for instance X and creates that many entries in the XML. Since both ports are created with the same instance-id as the port's device_id, the underlying virt layer is tricked into thinking there are supposed to be two interfaces.
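
A toy illustration of that ordering (not actual nova code): the virt layer simply builds one vNIC per quantum port whose device_id matches the instance, so a stale port left over from an earlier failed or retried allocation shows up as a second interface.

def build_guest_interfaces(quantum, instance_uuid):
    """Sketch: one libvirt <interface> entry per port tagged with this instance."""
    ports = quantum.list_ports(device_id=instance_uuid)['ports']
    return [{'mac': p['mac_address'], 'vif_uuid': p['id']} for p in ports]

# If port allocation ran twice for the same instance (e.g. after a timeout
# and retry) without the first port being cleaned up, this returns two
# entries and the generated XML gets two <interface> blocks.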

Revision history for this message
Sumit Naiksatam (snaiksat) said :
#29

Okay, but why is nova requesting that two ports be created just in this case when it should be asking for only one port?

Revision history for this message
Dan Mercer (dmercer) said :
#30

I know that we're chiming in on this a bit late, but we recently ran into this issue in our (grizzly+, NVP-backed) environment as well.

Thanks for the patch (https://review.openstack.org/#/c/32691/).
We will experiment with the new configuration settings and see if this helps us avoid doubly-allocated ports.

I'd like to note that the patch above seems to be only a means of working around the underlying issue, which AFAICT is the coordination of state between nova and neutron. The failure of a discrete API call due to a timeout or some other issue should always be expected, and duplicate/retry API calls should be handled in an idempotent manner by the resource manager. Maybe I missed some part of the discussion where this was called out as a larger issue that needs to be addressed and the patch above is just an expedient fix, or perhaps I am entirely off-base in my understanding of the issue.

Are there any ongoing (or even dusty!) discussions around the way state is managed between openstack services?
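
To make the idempotency point concrete, here is a hedged sketch (an illustration only, not what nova actually did) of a retry-safe allocation: look for an existing port bound to the instance on that network before creating a new one.

def get_or_create_port(quantum, network_id, instance_uuid):
    """Reuse an existing port for this instance on this network, if any."""
    existing = quantum.list_ports(device_id=instance_uuid,
                                  network_id=network_id)['ports']
    if existing:
        return existing[0]
    body = {'port': {'network_id': network_id,
                     'device_id': instance_uuid,
                     'admin_state_up': True}}
    return quantum.create_port(body)['port']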

Revision history for this message
Aaron Rosen (arosen) said :
#31

Hi Dan,

I filed this blueprint a while ago but haven't gotten a chance to start implementing it: https://blueprints.launchpad.net/nova/+spec/nova-api-quantum-create-port . I think this will help solve a lot of the orchestration issues we have with how nova and quantum work together. What I'm planning for the blueprint is basically to force clients to create a port in quantum first, then pass the port-id to nova when booting an instance (and, to keep the API backwards compatible, allocate the port on the nova-api server if a network is passed in). This should make things simpler and, in my opinion, help with this issue. Here is the discussion we had on the list a while ago about this: http://lists.openstack.org/pipermail/openstack-dev/2013-May/009088.html
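
Roughly what that workflow would look like from a client's point of view; this is a sketch only, using the quantumclient and novaclient Python bindings with placeholder credentials, image and network ids (the 'port-id' key for the nics argument is an assumption based on the novaclient of that era):

from quantumclient.v2_0 import client as q_client
from novaclient.v1_1 import client as n_client

# Placeholder credentials/endpoints.
quantum = q_client.Client(username='test', password='123456',
                          tenant_name='demoTenant',
                          auth_url='http://host-keystone:5000/v2.0')
nova = n_client.Client('test', '123456', 'demoTenant',
                       'http://host-keystone:5000/v2.0')

# 1) Create the port in quantum first...
port = quantum.create_port({'port': {'network_id': '<net-uuid>'}})['port']

# 2) ...then hand the existing port id to nova at boot time.
nova.servers.create('slave',
                    image=nova.images.find(name='<image-name>'),
                    flavor=nova.flavors.find(name='m1.small'),
                    nics=[{'port-id': port['id']}])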
