How to set up monitoring of a VNF in Tacker

Asked by Anshul Sehgal

Hi,

I have Tacker stable/pike set up through kolla-ansible. The OpenStack VIM is registered and REACHABLE, and I am able to deploy and terminate VNFs from Tacker.

However, the monitoring policy does not work when I apply it. I see ICMP packets from the Tacker server to the VIM server (for VIM status monitoring), but none to the VNF.

Below is the VNFD section containing the monitoring policy.
------------------------------------
 properties:
    image: CentOS-7
    name: Self_Healing
    flavor: Tacker_Flavor
    availability_zone: nova
    mgmt_driver: noop
    monitoring_policy:
      name: ping
      parameters:
        monitoring_delay: 20
        count: 3
        interval: 0.2
        timeout: 2
        retry: 6
      actions:
        failure: respawn
-----------------------------------------------------

Even if I shut down the instance manually in the VIM, the VNF manager still shows the instance as ACTIVE. There is no trace of this in the tacker-server, tacker-conductor, or mistral logs.

In the event tab of the deployed VNF, the last event has Resource state "Active" and Event type "Monitor", with the monitoring details. No workflow is created to monitor the VNF.

The running container services are:

tacker_server, tacker_conductor, barbican_worker, barbican_keystone_listener, barbican_api, mistral_executor, mistral_engine, mistral_api, horizon, keystone, rabbitmq, mariadb, cron, kolla_toolbox, fluentd and memcached.

Kindly suggest what should be checked.

I tried referring to https://docs.google.com/document/d/1TQUyEek_M1jbYK_nqiUuPVLurx4LhRnpv7ZUCFHGM5M/edit#heading=h.gjdgxs. Is ZooKeeper necessary for monitoring?

Question information

Language: English
Status: Solved
For: tacker
Assignee: Toshiaki Takahashi
Solved by: Anshul Sehgal
Revision history for this message
Anshul Sehgal (anshul-sehgal) said :
#1

I see that we have to enable the monitoring driver by adding it in setup.py (https://opendev.org/openstack/tacker/commit/0e91cd11c8939a34eb3b7f75f6584032c67c9802).

My question: doesn't Tacker support ping and http_ping by default?
If so, is it mentioned clearly anywhere?

Launchpad Janitor (janitor) said :
#2

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Toshiaki Takahashi (takahashi-tsc) said :
#3

Sorry for the very late reply.

Tacker supports ping and http_ping by default.
I think you can use them without any monitor_driver setting.

(part of tacker.conf in my env)
------------------------------------------------------------------------
[tacker]

#
# From tacker.vnfm.monitor
#

# Monitor driver to communicate with Hosting VNF/logical service instance
# tacker plugin will use (list value)
#monitor_driver = ping,http_ping
------------------------------------------------------------------------

Your monitoring policy seems correct.
It works correctly in the latest Tacker. (The entire VNFD I used is shown at the end of this comment.)
Only tacker_conductor is required for ping monitoring to work.

I'd like to check a few points.

1. Would you show the whole VNFD?
I'd like to confirm:
  - Is the Mgmt IP set correctly?
  - Do you also set a scaling policy? Current Tacker has a bug when scaling and ping monitoring are used together.

2. Would you check the network connection between the OpenStack controller and the VM?
Just ping the VM's "Mgmt Ip Address" from the controller.

------------------------------------------------
$ openstack vnf list
+--------------------------------------+-------+-----------------------------+--------+--------------------------------------+--------------------------------------+
| ID | Name | Mgmt Ip Address | Status | VIM ID | VNFD ID |
+--------------------------------------+-------+-----------------------------+--------+--------------------------------------+--------------------------------------+
| 9223cf24-ea6e-4dff-b034-79836dd710a6 | myvnf | {"VDU1": "192.168.120.205"} | ACTIVE | 4f03c6f5-0bda-4a74-9b7d-5ba232d04fe2 | 547ebcdf-dd67-49bb-af70-3f6a52cd09a0 |
+--------------------------------------+-------+-----------------------------+--------+--------------------------------------+--------------------------------------+
------------------------------------------------

3. Would you check without the 'retry' setting?
Tacker had a bug related to this setting (now fixed).
I'm not sure whether it was fixed in the Pike cycle.
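
The connectivity check in point 2 can be scripted. The following is a minimal Python sketch, assuming a Linux host with the iputils `ping` binary; the function names (`mgmt_ips`, `ping_command`, `is_reachable`) are hypothetical helpers and not part of Tacker. The count and timeout defaults simply mirror the values in the monitoring policy and are not taken from Tacker's own ping driver.

```python
import json
import subprocess


def mgmt_ips(mgmt_ip_field: str) -> dict:
    """Parse the "Mgmt Ip Address" column, e.g. '{"VDU1": "192.168.120.205"}'."""
    return json.loads(mgmt_ip_field)


def ping_command(ip: str, count: int = 3, timeout: int = 2) -> list:
    """Build an iputils ping command: `count` probes, `timeout` seconds per reply."""
    return ["ping", "-c", str(count), "-W", str(timeout), ip]


def is_reachable(ip: str) -> bool:
    """Return True if ping exits with status 0 (assumes Linux iputils semantics)."""
    result = subprocess.run(
        ping_command(ip),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Run from the controller (or from inside the tacker_conductor container), `is_reachable(ip)` for each value returned by `mgmt_ips(...)` shows whether the host that performs the monitoring can reach the VDU at all.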

-----------------------------------------------------------------------------------------------
tosca_definitions_version: tosca_simple_profile_for_nfv_1_0_0

description: Demo example

metadata:
  template_name: sample-tosca-vnfd

topology_template:
  node_templates:
    VDU1:
      type: tosca.nodes.nfv.VDU.Tacker
      properties:
        image: cirros-0.4.0-x86_64-disk
        name: test_healing
        flavor: m1.tiny
        availability_zone: nova
        mgmt_driver: noop
        monitoring_policy:
          name: ping
          parameters:
            monitoring_delay: 20
            count: 3
            interval: 0.2
            timeout: 2
            retry: 6
          actions:
            failure: respawn

    CP1:
      type: tosca.nodes.nfv.CP.Tacker
      properties:
        management: true
        order: 0
        anti_spoofing_protection: false
      requirements:
        - virtualLink:
            node: VL1
        - virtualBinding:
            node: VDU1

    CP2:
      type: tosca.nodes.nfv.CP.Tacker
      properties:
        order: 1
        anti_spoofing_protection: false
      requirements:
        - virtualLink:
            node: VL2
        - virtualBinding:
            node: VDU1

    CP3:
      type: tosca.nodes.nfv.CP.Tacker
      properties:
        order: 2
        anti_spoofing_protection: false
      requirements:
        - virtualLink:
            node: VL3
        - virtualBinding:
            node: VDU1

    VL1:
      type: tosca.nodes.nfv.VL
      properties:
        network_name: net_mgmt
        vendor: Tacker

    VL2:
      type: tosca.nodes.nfv.VL
      properties:
        network_name: net0
        vendor: Tacker

    VL3:
      type: tosca.nodes.nfv.VL
      properties:
        network_name: net1
        vendor: Tacker
-----------------------------------------------------------------------------------------------

Yasufumi Ogawa (yasufum) said :
#4

Thanks Toshiaki for your answer!

Anshul Sehgal (anshul-sehgal) said :
#5

Thanks, Toshiaki, for the information, and sorry for the late reply from my end too.

I will test this with a newer version, Train or Ussuri.

Anshul Sehgal (anshul-sehgal) said :
#6

Hi,

I have deployed Tacker stable/train.

Monitoring still does not work.

For the management IP, I am using external-network, because the default network interface IP is not reachable from the Tacker server.

After deploying a VNF, an external IP is attached to it, and I am able to ping that IP from the tacker-conductor container.

But if I shut off the VNF instance in my OpenStack VIM, Tacker still shows it as ACTIVE and doesn't respawn it.

VNF Catalog
--------------------------------------------------------
description: Demo example
metadata:
  template_name: sample-tosca-vnfd
topology_template:
  node_templates:
    CP1:
      properties:
        anti_spoofing_protection: false
        management: true
        order: 0
      requirements:
      - virtualLink:
          node: VL1
      - virtualBinding:
          node: VDU1
      type: tosca.nodes.nfv.CP.Tacker
    VDU1:
      properties:
        availability_zone: nova
        flavor: Tacker_Flavor
        image: CentOS-root-login
        mgmt_driver: noop
        monitoring_policy:
          actions:
            failure: respawn
          name: ping
          parameters:
            count: 3
            interval: 0.2
            monitoring_delay: 20
            retry: 6
            timeout: 2
        name: test_healing
      type: tosca.nodes.nfv.VDU.Tacker
    VL1:
      properties:
        network_name: external-network
        vendor: Tacker
      type: tosca.nodes.nfv.VL
tosca_definitions_version: tosca_simple_profile_for_nfv_1_0_0

---------------------------------------------------------------

tacker vnf-list
===========
(virtual_env) [root@localhost mistral]# tacker vnf-list
Deprecated: tacker command line is deprecated, will be deleted after Rocky is released.
+--------------------------------------+------------+---------------------------+--------+--------------------------------------+--------------------------------------+
| id | name | mgmt_ip_address | status | vim_id | vnfd_id |
+--------------------------------------+------------+---------------------------+--------+--------------------------------------+--------------------------------------+
| 7daa4d3a-610d-4146-8035-592531cb9d06 | Monitoring | {"VDU1": "10.152.10.141"} | ACTIVE | a3708b03-65be-4845-89a1-2a55a943dda6 | a1e57a4a-0f7e-441c-b6a2-810cdf2c2896 |
+--------------------------------------+------------+---------------------------+--------+--------------------------------------+--------------------------------------+

tacker vnf-show Monitoring
=========================
(virtual_env) [root@localhost mistral]# tacker vnf-show Monitoring
Deprecated: tacker command line is deprecated, will be deleted after Rocky is released.
+-----------------+-------+
| Field | Value |
+-----------------+-------+
| attributes | {"monitoring_policy": "{\"vdus\": {\"VDU1\": {\"ping\": {\"actions\": {\"failure\": \"respawn\"}, \"name\": \"ping\", \"parameters\": {\"count\": 3, \"interval\": 0.2, \"monitoring_delay\": 20, \"retry\": 6, \"timeout\": 2}, \"monitoring_params\": {\"count\": 3, \"interval\": 0.2, \"monitoring_delay\": 20, \"retry\": 6, \"timeout\": 2}}}}}", "heat_template": "heat_template_version: 2013-05-23\ndescription: 'Demo example\n\n '\nparameters: {}\nresources:\n CP1:\n type: OS::Neutron::Port\n properties:\n port_security_enabled: false\n network: external-network\n VDU1:\n type: OS::Nova::Server\n properties:\n user_data_format: SOFTWARE_CONFIG\n name: test_healing\n availability_zone: nova\n image: CentOS-root-login\n flavor: Tacker_Flavor\n networks:\n - port:\n get_resource: CP1\n config_drive: false\noutputs:\n mgmt_ip-VDU1:\n value:\n get_attr:\n - CP1\n - fixed_ips\n - 0\n - ip_address\n"} |
| created_at | 2020-08-06 14:01:03 |
| description | Demo example |
| error_reason | |
| id | 7daa4d3a-610d-4146-8035-592531cb9d06 |
| instance_id | 81140836-8522-4b98-ab7c-381a038d79c0 |
| mgmt_ip_address | {"VDU1": "10.152.10.141"} |
| name | Monitoring |
| placement_attr | {"vim_name": "Staging_OpenStack"} |
| status | ACTIVE |
| tenant_id | c7fdb952003c49b199a25012b1fd9a1e |
| updated_at | |
| vim_id | a3708b03-65be-4845-89a1-2a55a943dda6 |
| vnfd_id | a1e57a4a-0f7e-441c-b6a2-810cdf2c2896 |
+-----------------+-------+
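
In the attributes row above, the monitoring_policy value is a JSON string nested inside the outer JSON object, which makes it hard to read. Below is a minimal Python sketch for unpacking it and confirming which monitor and failure action Tacker stored for each VDU. The function name `monitoring_summary` is hypothetical, and the sketch assumes the attributes value has already been captured as a string with the same structure as shown in the output above.

```python
import json


def monitoring_summary(attributes_json: str) -> dict:
    """Map each VDU name to a {monitor name: failure action} dict."""
    attributes = json.loads(attributes_json)
    # "monitoring_policy" is itself a JSON document stored as a string,
    # so it needs a second decoding pass.
    policy = json.loads(attributes["monitoring_policy"])
    summary = {}
    for vdu, monitors in policy["vdus"].items():
        summary[vdu] = {
            name: spec["actions"]["failure"] for name, spec in monitors.items()
        }
    return summary
```

For the VNF shown here, the result would pair VDU1 with the ping monitor and the respawn action, which confirms that the policy was stored even though no monitoring was taking place.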

Anshul Sehgal (anshul-sehgal) said :
#8

Hi,

I have one observation.

If I restart the tacker-server container, I then see the monitoring in the tacker-server logs.
If I then delete the VNF, Tacker doesn't stop monitoring; it keeps reporting an error that the VNF IP is not reachable. If I restart the tacker-server container again, that error also goes away.

It seems some service is not triggered at deployment time and only comes into action when the Tacker service is restarted.

Anshul Sehgal (anshul-sehgal) said :
#9

Hi,

Can anyone help with this?

Toshiaki Takahashi (takahashi-tsc) said :
#10

Sorry for the late reply again.

Would you check the Tacker worker count setting?

-------------------------------------------------------------------------
$ grep worker /etc/tacker/tacker.conf
# Number of separate worker processes for service (integer value)
#api_workers = 0
# (``swift_store_large_object_chunk_size`` * ``workers`` * 1000)
-------------------------------------------------------------------------

I haven't completely reproduced your problem yet.
However, when I change the above "api_workers" setting to 3, ping monitoring does not work in my env.
After I restart tacker-server, monitoring works again.

I think this setting is not working properly.
I will continue to investigate it.
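
As an alternative to grep, the effective value can be read with Python's standard configparser. This is a minimal sketch, assuming api_workers lives in the [DEFAULT] section and that the file starts with a section header; the function name `read_api_workers` is hypothetical. Note that configparser only approximates the oslo.config file format that Tacker itself uses.

```python
import configparser


def read_api_workers(conf_text: str):
    """Return api_workers from [DEFAULT] as an int, or None if unset or commented out."""
    # interpolation=None keeps '%' characters in passwords or URLs from raising errors;
    # strict=False tolerates options that are repeated in the file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    value = parser.defaults().get("api_workers")
    return int(value) if value is not None else None
```

For example, `read_api_workers(open("/etc/tacker/tacker.conf").read())` returns None when the line is commented out, as in the grep output above, and the configured number otherwise.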

Anshul Sehgal (anshul-sehgal) said :
#11

Thanks, Toshiaki, for your reply.

Below are the required details.

(tacker-server)[tacker@localhost /]$ grep worker /etc/tacker/tacker.conf
api_workers = 5

Note: This setting was not modified manually.

Complete tacker.conf (I should have provided it at the beginning; sorry for leaving it out.)
==============================
cat /etc/kolla/tacker-server/tacker.conf

[DEFAULT]
debug = True
log_dir = /var/log/kolla/tacker
transport_url = rabbit://openstack:JyWcpZKnE2oNEfB1bGFQFCFZyUf9DXP9N3DDLrLj@10.152.5.75:5672//
bind_host = 10.152.5.75
bind_port = 9890
api_workers = 5
service_plugins = nfvo,vnfm

[nfvo]
vim_drivers = openstack

[openstack_vim]
stack_retries = 60
stack_retry_wait = 10

[vim_keys]
use_barbican = True

[tacker]
monitor_driver = ping,http_ping
alarm_monitor_driver = ceilometer

[database]
connection = mysql+pymysql://tacker:qEmk1MKVlgMBkGF9l94ptnEHlYmLJfPNxWv2asim@10.152.5.75:3306/tacker
max_retries = -1

[keystone_authtoken]
www_authenticate_uri = http://10.152.5.75:5000
auth_url = http://10.152.5.75:35357
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = tacker
password = 7orSBqFSVhrQWCx781wSYHGoxqWWn9bqLJsy60JV
memcache_security_strategy = ENCRYPT
memcache_secret_key = Hcz0f1rpX6fqdWg0O7wBi2FvTYT9PPCEk9UtE3ti
memcached_servers = 10.152.5.75:11211

[alarm_auth]
username = tacker
password = 7orSBqFSVhrQWCx781wSYHGoxqWWn9bqLJsy60JV
project_name = service
url = http://10.152.5.75:35357

[ceilometer]
host = 10.152.5.75
port = 9890

[oslo_messaging_notifications]
transport_url = rabbit://openstack:JyWcpZKnE2oNEfB1bGFQFCFZyUf9DXP9N3DDLrLj@10.152.5.75:5672//
driver = messagingv2
topics = notifications

[glance_store]
filesystem_store_datadir = /var/lib/tacker/csar_files

I will also test with that parameter commented out (#api_workers = 0) and will post the results here.

Anshul Sehgal (anshul-sehgal) said :
#12

Hi Toshiaki,

Please find below an update after commenting out the setting (#api_workers = 5).

After commenting it out, I restarted both tacker_server and tacker_conductor.

After this, I see ping monitoring working.

I don't need to restart the tacker_server container anymore. I see ICMP packets for both the VIM and my VNF.

If I terminate the VNF, the monitoring also stops.

If I shut off the VM from the OpenStack VIM, Tacker respawns the VM with the same IP.

I am very happy to see this working for the first time since I started working on Tacker. Thank you very much.

I have tested this with a single VDU. I will check with multiple VDUs as well.

Could you confirm whether commenting out api_workers (#api_workers = 5) is a standard configuration, that is, that it will not block any major feature of Tacker?

Anshul Sehgal (anshul-sehgal) said :
#13

Hi,

I have also tested multi-VDU monitoring, and it works fine. (Shutting off a single instance causes the whole stack to be reinitialized.)

I also tested multi-VDU monitoring with the vdu_autoheal policy. (Shutting off a single instance just deletes and re-creates the faulty instance.)

In the docs (https://specs.openstack.org/openstack/tacker-specs/specs/stein/vdu-auto-healing.html), the policy name is written as vdu_autohealing, which is wrong. The correct policy name is vdu_autoheal (found at https://github.com/openstack/tacker/blob/master/samples/tosca-templates/vnfd/tosca-vnfd-monitoring-vdu-autoheal.yaml).

Toshiaki Takahashi (takahashi-tsc) said :
#14

Thank you for your useful information!

First, we will fix the document.
I also think the problem is that "api_workers" does not work correctly.
I'll write docs explaining that commenting it out is recommended, and will also consider making "api_workers" work correctly.
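
The workaround from this thread, commenting out api_workers, can be applied to the text of tacker.conf with a small helper. This is a minimal Python sketch; the function name `comment_out_api_workers` is hypothetical, and it only edits a string, so reading and writing the file and restarting the services are left to the operator.

```python
import re


def comment_out_api_workers(conf_text: str) -> str:
    """Prefix any active `api_workers = ...` line with '#'; other lines are untouched."""
    pattern = re.compile(r"^([ \t]*)(api_workers[ \t]*=.*)$", flags=re.MULTILINE)
    return pattern.sub(r"\1#\2", conf_text)
```

Lines that are already commented out do not match, so running the helper twice gives the same result. In the kolla-ansible setup reported above, tacker_server and tacker_conductor were restarted after the change.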

Anshul Sehgal (anshul-sehgal) said :
#15

Thank you very much Toshiaki.

JohnJohnson (johnjo-hnson88) said :
#16

Thank you for such a detailed explanation. I followed your advice. I did it.