NSX Ansible Updates

It has been a hectic few months for me. I relocated my family to Colorado last month, and as a result all of my side-projects were put on the back-burner. During the time I was away, I was selected for the fourth year in a row as a vExpert! I am grateful to be a part of this awesome community! I strive to make the work I do and submit here worthwhile and informative for others.

Now that I am back into the swing of things, I was able to jump back into improving the NSX-v Ansible module. Recently a member of the community opened an issue regarding the implementation method I had used for Edge NAT rules. Sure enough, they were correct in that the method I was using was really an append and not a creation.

Note: I wrote and tested the code against a pre-release version of NSX-v 6.4.1. There are no documented differences between NSX 6.3.x and 6.4.x for the two API calls used in the nsx_edge_nat.py Ansible module.

When I looked at the NSX API, I realized there were two methods for adding NAT rules to an NSX Edge:

PUT /api/4.0/edges/{edgeId}/nat/config

URI Parameters:
edgeId (required) Specify the ID of the edge in edgeId.
Configure NAT rules for an Edge.

If you use this method to add new NAT rules, you must include all existing rules in the request body. Any rules that are omitted will be deleted.
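Because this endpoint replaces the entire NAT table, a safe "add" has to follow a read-modify-write pattern. The sketch below illustrates the semantics with plain Python data structures; the rule dictionaries and ruleId values are illustrative placeholders, not the exact schema the NSX API returns.

```python
# Sketch of the PUT semantics: /api/4.0/edges/{edgeId}/nat/config
# replaces the WHOLE NAT table, so a safe "add" is read-modify-write.
# The rule dictionaries and ruleId values below are illustrative only.

existing_rules = [  # what a GET of the current NAT config might return
    {'ruleId': 196609, 'action': 'dnat', 'originalPort': '80'},
    {'ruleId': 196610, 'action': 'dnat', 'originalPort': '443'},
]

new_rule = {'action': 'dnat', 'originalPort': '8080'}

# The PUT body must carry the existing rules plus the new one;
# sending only [new_rule] would silently delete the first two.
put_body = existing_rules + [new_rule]
```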

And also:

POST /api/4.0/edges/{edgeId}/nat/config/rules

URI Parameters:
edgeId (required)
Specify the ID of the edge in edgeId.

Query Parameters:
aboveRuleId (optional)
Specified rule ID. If no NAT rules exist, you can specify rule ID 0.

Add a NAT rule above a specific rule in the NAT rules table (using aboveRuleId query parameter) or append NAT rules to the bottom.
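As a rough sketch of how this second endpoint is addressed, the helper below builds the request URL and appends the optional aboveRuleId query parameter. The manager hostname and edge ID are placeholders for illustration.

```python
from urllib.parse import urlencode

# Illustrative helper: builds the append-endpoint URL described above.
# The NSX Manager URL and edge ID are placeholders.
def nat_append_url(nsx_manager, edge_id, above_rule_id=None):
    url = '{0}/api/4.0/edges/{1}/nat/config/rules'.format(nsx_manager, edge_id)
    if above_rule_id is not None:
        # insert the new rule above this existing rule ID
        url += '?' + urlencode({'aboveRuleId': above_rule_id})
    return url

print(nat_append_url('https://nsxmanager.example.com', 'edge-1', 196609))
```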

The original code was using the second method, which meant each time an Ansible playbook was run, it would return an OK status because it appended the rules again, even if they already existed.

I decided to dive into the issue and spent a few more hours than I anticipated improving the code. It is now possible to use either method: create a full set of rules (replacing any existing rules) or append new rules to the existing ruleset.

In order to create multiple rules, I modified how the Ansible playbook is interpreted. The example is in the readme.md file, but I want to highlight it here:

---
- hosts: localhost
  connection: local
  gather_facts: False
  vars_files:
    - nsxanswer.yml
    - envanswer.yml

  tasks:
  - name: Create NAT rules
    nsx_edge_nat:
      nsxmanager_spec: '{{ nsxmanager_spec }}'
      mode: 'create'
      name: '{{ edge_name }}'
      rules:
        dnat0: { description: 'Ansible created HTTP NAT rule',
            loggingEnabled: 'true',
            rule_type: 'dnat',
            nat_enabled: 'true',
            dnatMatchSourceAddress: 'any',
            dnatMatchSourcePort: 'any',
            vnic: '0',
            protocol: 'tcp',
            originalAddress: '',
            originalPort: '80',
            translatedAddress: '',
            translatedPort: '80'
          }
        dnat1: { description: 'Ansible created HTTPS NAT rule',
            loggingEnabled: 'true',
            rule_type: 'dnat',
            vnic: '0',
            nat_enabled: 'true',
            dnatMatchSourceAddress: 'any',
            dnatMatchSourcePort: 'any',
            protocol: 'tcp',
            originalAddress: '',
            originalPort: '443',
            translatedAddress: '',
            translatedPort: '443'
          }

Please note that the identifiers dnat0 and dnat1 are merely that: identifiers for your playbook. They do not influence the API call made to the NSX Manager.

A new function was required in the Ansible module to combine multiple rules into a single API call that adds each rule. The data structure used to create this dictionary of lists was rather convoluted, since Python struggles to convert these sorts of structures to XML properly. With some help from a few people in the NSBU, I was able to get it working.

The new create_init_nat_rules() function:

def create_init_nat_rules(client_session, module):
    """
    Create a single dictionary with all of the NAT rules, both SNAT and DNAT,
    to be used in a single API call. Should be used when wiping out ALL existing
    rules or when a new NSX Edge is created.
    :return: dictionary with the full NAT rules list
    """
    nat_rules = module.params['rules']
    params_check_nat_rules(module)

    nat_rules_info = {'natRule': []}

    for rule_key, nat_rule in nat_rules.items():
        rule_type = nat_rule['rule_type']
        # Fields common to both SNAT and DNAT rules
        new_rule = {'action': rule_type,
                    'vnic': nat_rule['vnic'],
                    'originalAddress': nat_rule['originalAddress'],
                    'translatedAddress': nat_rule['translatedAddress'],
                    'loggingEnabled': nat_rule['loggingEnabled'],
                    'enabled': nat_rule['nat_enabled'],
                    'protocol': nat_rule['protocol'],
                    'originalPort': nat_rule['originalPort'],
                    'translatedPort': nat_rule['translatedPort'],
                    'description': nat_rule['description']}
        if rule_type == 'snat':
            new_rule['snatMatchDestinationAddress'] = nat_rule['snatMatchDestinationAddress']
            new_rule['snatMatchDestinationPort'] = nat_rule['snatMatchDestinationPort']
        elif rule_type == 'dnat':
            new_rule['dnatMatchSourceAddress'] = nat_rule['dnatMatchSourceAddress']
            new_rule['dnatMatchSourcePort'] = nat_rule['dnatMatchSourcePort']
        if nat_rule['protocol'] == 'icmp':
            # icmpType belongs on the individual rule, not the rule list
            new_rule['icmpType'] = nat_rule['icmpType']
        nat_rules_info['natRule'].append(new_rule)

    return nat_rules_info
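To see why the function builds a dictionary holding a list under the natRule key, the toy serializer below shows each list entry becoming one repeated natRule element in the XML request body. This is illustrative only; in the module itself the XML conversion is handled by nsxramlclient, not hand-rolled like this.

```python
import xml.etree.ElementTree as ET

# Toy serializer, illustrative only: each entry in the 'natRule' list
# becomes one repeated <natRule> element in the request body.
nat_rules_info = {'natRule': [
    {'action': 'dnat', 'originalPort': '80', 'translatedPort': '80'},
    {'action': 'dnat', 'originalPort': '443', 'translatedPort': '443'},
]}

root = ET.Element('natRules')
for rule in nat_rules_info['natRule']:
    rule_el = ET.SubElement(root, 'natRule')
    for key, value in sorted(rule.items()):
        ET.SubElement(rule_el, key).text = value

xml_body = ET.tostring(root)
print(xml_body)
```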

I also took the opportunity to clean up some of the excessively long lines of code to make it more readable. The result is a new working playbook for initial Edge NAT rule creation and the ability to add new rules later. A few improvements remain on my list; mainly, I would like to add logic so that, when performing an append, the module checks whether a rule already exists and skips it.

In the meantime, the code has been checked into GitHub and the Docker image I use for running Ansible has been updated.


Docker for Ansible + VMware NSX Automation

I am writing this as I sit and watch the annual viewing of The Hobbit and The Lord of the Rings trilogy over the Christmas holiday. The next couple of weeks should provide the time necessary to hopefully complete the Infrastructure-as-Code project I undertook last month. As part of that project, I spoke previously about how Ansible is being used to provide the automation layer for the deployment and configuration of the SDDC Kubernetes stack. As part of the bootstrapping effort, I have decided to create a Docker image with the necessary components to perform the initial virtual machine deployment and NSX configuration.

The Dockerfile for the Ubuntu-based Docker container is hosted both on Docker Hub and within the Github repository for the larger Infrastructure-as-Code project.

When the Docker container is launched, it includes the necessary components to interact with the VMware stack, including additional modules for VM folders, resource pools and VMware NSX.

To launch the container, I am running it with the following options to include the local copies of the Infrastructure-as-Code project.

$ docker run -it --name ansible -v /Users/cmutchler/github/vsphere-kubernetes/ansible/:/opt/ansible virtualelephant/ubuntu-ansible

The Docker container is a bit on the larger side, but it is designed to run locally on a laptop or desktop. The image includes the required Python and NSX bits so that the additional Github repositories that are cloned into the image will operate correctly. The OpenShift project includes additional modules for interacting with vSphere folders and resource pools, while the NSX modules from the VMware Github repository includes the necessary bits for leveraging Ansible with NSX.

Once running, the Docker container is then able to bootstrap the deployment of the Infrastructure-as-Code project using the Ansible playbooks I’ve published on Github. Enjoy!

Infrastructure-as-Code: Ansible for VMware NSX

As the project moves into the next phase, Ansible is beginning to be relied upon for the deployment of the individual components that will define the environment. This installment of the series is going to cover the use of Ansible with VMware NSX. VMware has provided a set of Ansible modules for integrating with NSX on GitHub. The modules easily allow the creation of NSX Logical Switches, NSX Distributed Logical Routers, NSX Edge Services Gateways (ESG) and many other components.

The GitHub repository can be found here.

Step 1: Installing Ansible NSX Modules

In order to support the Ansible NSX modules, it was necessary to install several supporting packages on the Ubuntu Ansible Control Server (ACS).

$ sudo apt-get install python-dev libxml2 libxml2-dev libxslt1-dev zlib1g-dev npm
$ sudo pip install nsxramlclient
$ sudo npm install -g https://github.com/yfauser/raml2html
$ sudo npm install -g https://github.com/yfauser/raml2postman
$ sudo npm install -g raml-fleece

In addition to the Ansible NSX modules, the ACS will also require the NSX for vSphere RAML repository. The RAML specification describes the NSX for vSphere API. The repo will need to be cloned to a local directory on the ACS before an Ansible playbook can be executed.

Now that all of the prerequisites are met, the Ansible playbook for creating the NSX components can be written.

Step 2: Ansible Playbook for NSX

The first thing to know is that the GitHub repo for the NSX modules includes many great examples within the test_*.yml files, which were leveraged to create the playbook below. To understand what the Ansible playbook has been written to create, let’s first review the logical network design for the Infrastructure-as-Code project.


The design calls for three layers of NSX virtual networking to exist — the NSX ECMP Edges, the Distributed Logical Router (DLR) and the Edge Services Gateway (ESG) for the tenant. The Ansible Playbook below assumes the ECMP Edges and DLR already exist. The playbook will focus on creating the HA Edge for the tenant and configuring the component services (SNAT/DNAT, DHCP, routing).

The GitHub repository for the NSX Ansible modules provides many great code examples. The playbook that I’ve written to create the k8s_internal logical switch and the NSX HA Edge (aka ESG) took much of the content provided and collapsed it into a single playbook. The NSX playbook I’ve written can be found in the Virtual Elephant GitHub repository for the Infrastructure-as-Code project.

As I’ve stated, this project is mostly about providing me a detailed game plan for learning several new (to me) technologies, including Ansible. The NSX playbook is the first time I’ve used an answer file to obfuscate several of the sensitive variables needed for my environment. The nsxanswer.yml file includes the variables required for connecting to the NSX Manager, which is the component Ansible communicates with to create the logical switch and ESG.

Ansible Answer File: nsxanswer.yml (link)

nsxmanager_spec:
  raml_file: '/HOMEDIR/nsxraml/nsxvapi.raml'
  host: 'usa1-2-nsxv'
  user: 'admin'
  password: 'PASSWORD'

The nsxvapi.raml file is the API specification file that we cloned in step 1 from the GitHub repository. The path should be modified for your local environment, as should the password: variable line for the NSX Manager.

Ansible Playbook: nsx.yml (link)

---
- hosts: localhost
  connection: local
  gather_facts: False
  vars_files:
    - nsxanswer.yml
  vars_prompt:
  - name: "vcenter_pass"
    prompt: "Enter vCenter password"
    private: yes
  vars:
    vcenter: "usa1-2-vcenter"
    datacenter: "Lab-Datacenter"
    datastore: "vsanDatastore"
    cluster: "Cluster01"
    vcenter_user: "administrator@vsphere.local"
    switch_name: "{{ switch }}"
    uplink_pg: "{{ uplink }}"
    ext_ip: "{{ vip }}"
    tz: "tzone"

  tasks:
  - name: NSX Logical Switch creation
    nsx_logical_switch:
      nsxmanager_spec: "{{ nsxmanager_spec }}"
      state: present
      transportzone: "{{ tz }}"
      name: "{{ switch_name }}"
      controlplanemode: "UNICAST_MODE"
      description: "Kubernetes Infra-as-Code Tenant Logical Switch"
    register: create_logical_switch

  - name: Gather MOID for datastore for ESG creation
    vcenter_gather_moids:
      hostname: "{{ vcenter }}"
      username: "{{ vcenter_user }}"
      password: "{{ vcenter_pass }}"
      datacenter_name: "{{ datacenter }}"
      datastore_name: "{{ datastore }}"
      validate_certs: False
    register: gather_moids_ds
    tags: esg_create

  - name: Gather MOID for cluster for ESG creation
    vcenter_gather_moids:
      hostname: "{{ vcenter }}"
      username: "{{ vcenter_user }}"
      password: "{{ vcenter_pass }}"
      datacenter_name: "{{ datacenter }}"
      cluster_name: "{{ cluster }}"
      validate_certs: False
    register: gather_moids_cl
    tags: esg_create

  - name: Gather MOID for uplink
    vcenter_gather_moids:
      hostname: "{{ vcenter }}"
      username: "{{ vcenter_user }}"
      password: "{{ vcenter_pass }}"
      datacenter_name: "{{ datacenter }}"
      portgroup_name: "{{ uplink_pg }}"
      validate_certs: False
    register: gather_moids_upl_pg
    tags: esg_create

  - name: NSX Edge creation
    nsx_edge_router:
      nsxmanager_spec: "{{ nsxmanager_spec }}"
      state: present
      name: "{{ switch_name }}-edge"
      description: "Kubernetes Infra-as-Code Tenant Edge"
      resourcepool_moid: "{{ gather_moids_cl.object_id }}"
      datastore_moid: "{{ gather_moids_ds.object_id }}"
      datacenter_moid: "{{ gather_moids_cl.datacenter_moid }}"
      interfaces:
        vnic0: {ip: "{{ ext_ip }}", prefix_len: 26, portgroup_id: "{{ gather_moids_upl_pg.object_id }}", name: 'uplink0', iftype: 'uplink', fence_param: 'ethernet0.filter1.param1=1'}
        vnic1: {ip: '', prefix_len: 20, portgroup_id: "{{ switch_name }}", name: 'int0', iftype: 'internal', fence_param: 'ethernet0.filter1.param1=1'}
      default_gateway: "{{ gateway }}"
      remote_access: 'true'
      username: 'admin'
      password: "{{ nsx_admin_pass }}"
      firewall: 'false'
      ha_enabled: 'true'
    register: create_esg
    tags: esg_create

The playbook expects to be provided three extra variables from the CLI when it is executed — switch, uplink and vip. The switch variable defines the name of the logical switch, the uplink variable defines the uplink VXLAN portgroup the tenant ESG will connect to, and the vip variable is the external VIP to be assigned from the network block. At the time of this writing, these sorts of variables continue to be command-line based, but will likely be moved to a single Ansible answer file as the project matures. Having a single answer file for the entire set of playbooks should simplify the adoption of the Infrastructure-as-Code project into other vSphere environments.
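An invocation might look like the following; the variable values shown are hypothetical placeholders for my lab environment.

```shell
# Hypothetical values: substitute your own logical switch name,
# uplink VXLAN portgroup and external VIP.
ansible-playbook nsx.yml \
  --extra-vars "switch=k8s_internal uplink=uplink-pg vip=10.0.0.10"
```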

Now that Ansible playbooks exist for creating the NSX components and the VMs for the Kubernetes cluster, the next step will be to begin configuring the software within CoreOS to run Kubernetes.

Stay tuned.

Infrastructure-as-Code: Project Overview

In an effort to get caught-up with the Cloud Native space, I am embarking on an effort to build a completely dynamic Kubernetes environment entirely through code. To accomplish this, I am using (and learning) several technologies, including:

  • Container OS (CoreOS) for the Kubernetes nodes.
  • Ignition for configuring CoreOS.
  • Ansible for automation and orchestration.
  • Kubernetes
  • VMware NSX for micro-segmentation, load balancing and DHCP.

There are a lot of great articles on the Internet around Kubernetes, CoreOS and other Cloud Native technologies. If you are unfamiliar with Kubernetes, I highly encourage you to read the articles written by Hany Michaels (Kubernetes Introduction for VMware Users and Kubernetes in the Enterprise – The Design Guide). These are especially useful if you already have a background in VMware technologies and are just getting started in the Cloud Native space. Mr. Michaels does an excellent job comparing concepts you are already familiar with and aligning them with Kubernetes components.

Moving on, the vision I have for this Infrastructure-as-Code project is to build a Kubernetes cluster leveraging my vSphere lab with the SDDC stack (vSphere, vCenter, vSAN and NSX). I want to codify it in a way that an environment can be stood up or torn down in a matter of minutes without having to interact with any user-interface. I am also hopeful the lessons learned whilst working on this project will be applicable to other cloud native technologies, including Mesos and Cloud Foundry environments.

Logically, the project will create the following within my vSphere lab environment:


I will cover the NSX components in a future post, but essentially each Kubernetes environment will be attached to a HA pair of NSX Edges. The ECMP Edges and Distributed Logical Router are already in place, as they are providing upstream network connectivity for my vSphere lab. The project will focus on the internal network (VXLAN-backed), attached to the NSX HA Edge devices, which will provide the inter-node network connectivity. The NSX Edge is configured to provide firewall, routing and DHCP services to all components inside its network space.

The plan for the project and the blog series is to document every facet of development and execution of the components, with the end goal being that anyone reading the series can understand how all the pieces interrelate with one another. The series will kick off with the following posts:

  • Bootstrapping CoreOS with Ignition
  • Understanding Ignition files
  • Using Ansible with Ignition
  • Building Kubernetes cluster with Ansible
  • Deploying NSX components using Ansible
  • Deploying full stack using Ansible

If time allows, I may also embark on migrating from NSX-V to NSX-T for providing some of the tenant software-defined networking.

I hope you enjoy the series!

[Introduction] [Part 1 – Bootstrap CoreOS with Ignition] [Part 2 – Understanding CoreOS Ignition] [Part 3 – Getting started with Ansible]


Troubleshoot OpenStack Neutron + NSX


OpenStack Neutron likes to use some pretty awesome reference IDs for the tenant network objects. You know, helpful strings like ec43c520-bfc6-43d5-ba2b-d13b4ef5a760. The first time I saw that, I told myself it was going to be a nightmare when trying to troubleshoot an issue.


Fortunately, VMware NSX also uses a similar character string when it creates logical switches. If NSX is being used in conjunction with OpenStack Neutron, magic happens. The logical switch is created with a string like vxw-dvs-9-virtualwire-27-sid-10009-ec43c520-bfc6-43d5-ba2b-d13b4ef5a760.


A keen eye will have noticed the OpenStack Neutron reference ID is included in the NSX logical switch name!
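Since the Neutron reference ID is simply the trailing UUID of the logical switch name, mapping one to the other is easy to script. A small sketch:

```python
import re

# The NSX logical switch name ends with the Neutron network UUID,
# so a UUID-shaped regex anchored at the end of the string recovers it.
def neutron_id_from_nsx_name(switch_name):
    match = re.search(
        r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$',
        switch_name)
    return match.group(0) if match else None

name = 'vxw-dvs-9-virtualwire-27-sid-10009-ec43c520-bfc6-43d5-ba2b-d13b4ef5a760'
print(neutron_id_from_nsx_name(name))  # ec43c520-bfc6-43d5-ba2b-d13b4ef5a760
```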

From there you can reference the NSX Edge virtual machines and see which interface the NSX logical switch is attached to. This tidbit of information proved useful today when I was troubleshooting an issue for a developer and is a piece of information going into my VCDX SOP document.