Enhanced NSX Modules for Ansible

The published NSX modules from VMware lack certain functionality that I’ve needed as I worked on the Infrastructure-as-Code project over the holiday break. A few of the things I need to be able to do include:

  • Enable Edge firewall and add/delete rules
  • Enable DHCP and add IP pools
  • Search for DVS VXLAN port group
  • Associate vNIC with default gateway
  • Create Edge SNAT/DNAT rules

As I investigated methods for accomplishing these tasks, I found another VMware repository of Python scripts that had some of the functionality. The library is designed as a command-line tool, but I was able to take several of the code blocks and modify them for use within an Ansible playbook. To track the changes that I’ve made to the Ansible modules, I’ve forked the vmware/nsxansible repo into virtualelephant/nsxansible on GitHub. After a bit of work over the holiday, I’ve managed to add functionality for all but the SNAT/DNAT rule creation.

In addition to writing the Python modules, I have modified the Docker ubuntu-ansible container I spoke about previously to include my forked branch of the vmware/nsxansible modules.

Creating DHCP Pools

The module nsx_edge_dhcp.py allows an Ansible playbook to create a new IP pool and enable the DHCP service on a previously deployed NSX Edge. The playbook can currently support all of the basic IP pool options, as seen in the following image:

The playbook can contain the following code:

---
- hosts: localhost
  connection: local
  gather_facts: False
  vars_files:
    - nsxanswer.yml
    - envanswer.yml

  tasks:
  - name: Create DHCP pool on NSX Edge
    nsx_edge_dhcp:
      nsxmanager_spec: "{{ nsxmanager_spec }}"
      name: '{{ edge_name }}'
      mode: 'create_pool'
      ip_range: '{{ ip_range }}'
      subnet: '{{ netmask }}'
      default_gateway: '{{ gateway }}'
      domain_name: '{{ domain }}'
      dns_server_1: '{{ dns1_ip }}'
      dns_server_2: '{{ dns2_ip }}'
      lease_time: '{{ lease_time }}'
      next_server: '{{ tftp_server }}'
      bootfile: '{{ bootfile }}'

A future enhancement to the module will allow for the DHCP Options variables to be updated as well. This is key for the project so that the scope points to the TFTP server where CoreOS is installed from.

Update: The ability to add a TFTP next-server and specify the filename for downloading has been added and is contained in the GitHub repo virtualelephant/nsxansible.
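Before handing ip_range, netmask and gateway to the module, it can help to confirm that the pool actually sits inside the gateway's subnet. The sketch below is a hypothetical pre-flight helper and is not part of nsx_edge_dhcp.py. It assumes ip_range is written as start-end, which is my assumption about the format rather than something the module documents here.

```python
import ipaddress


def pool_in_subnet(ip_range, netmask, gateway):
    """Return True if both ends of 'start-end' sit in the gateway's subnet.

    Assumes ip_range looks like '10.0.0.10-10.0.0.50' (hypothetical format).
    """
    start_text, end_text = ip_range.split('-')
    start = ipaddress.ip_address(start_text.strip())
    end = ipaddress.ip_address(end_text.strip())
    # strict=False derives the network from a host address plus its netmask.
    network = ipaddress.ip_network('{}/{}'.format(gateway, netmask), strict=False)
    return start <= end and start in network and end in network
```

A playbook wrapper could call pool_in_subnet() on the answer-file values and stop early if it returns False, instead of waiting for the NSX Manager to reject the pool.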

Edge Firewall Rules

Fortunately, another author on GitHub had already written this module. They even submitted a pull request last year to have it included in the vmware/nsxansible repo, but since it had not yet been merged, I forked it into my own repo for use.

The nsx_edge_firewall.py module allows you to modify the default rule and create new rules on an NSX Edge device.

The Ansible playbook contains the following to create the default firewall policy:

---
- hosts: localhost
  connection: local
  gather_facts: False
  vars_files:
    - nsxanswer.yml
    - envanswer.yml

  tasks:
  - name: Set default firewall rule policy
    nsx_edge_firewall:
      nsxmanager_spec: "{{ nsxmanager_spec }}"
      mode: 'set_default_action'
      edge_name: '{{ edge_name }}'
      default_action: 'accept'

Specify vNIC for Default Gateway

The original nsx_edge_router.py module included code to create the default gateway; however, it did not allow you to modify the MTU or specify which vNIC should be associated with the default gateway. The forked nsx_edge_router.py version in the Virtual Elephant GitHub repo includes the necessary code to specify both of those options.

def config_def_gw(client_session, esg_id, dfgw, vnic, mtu, dfgw_adminDistance):
    # Fall back to the standard Ethernet MTU when none is supplied.
    if not mtu:
        mtu = '1500'
    # Read the Edge's current static routing configuration.
    rtg_cfg = client_session.read('routingConfigStatic', uri_parameters={'edgeId': esg_id})['body']
    if dfgw:
        try:
            rtg_cfg['staticRouting']['defaultRoute'] = {'gatewayAddress': dfgw, 'vnic': vnic, 'mtu': mtu, 'adminDistance': dfgw_adminDistance}
        except KeyError:
            rtg_cfg['staticRouting']['defaultRoute'] = {'gatewayAddress': dfgw, 'vnic': vnic, 'adminDistance': dfgw_adminDistance, 'mtu': mtu}
    else:
        # No gateway requested: clear any existing default route.
        rtg_cfg['staticRouting']['defaultRoute'] = None

    cfg_result = client_session.update('routingConfigStatic', uri_parameters={'edgeId': esg_id},
                                       request_body_dict=rtg_cfg)
    # A 204 status means the update was accepted.
    if cfg_result['status'] == 204:
        return True
    else:
        return False
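To see how the function behaves without touching a real NSX Manager, a stub session can stand in for the client. The sketch below pairs a simplified copy of the function with a hypothetical FakeSession class. The only thing it assumes about the real client is what the code above already shows: read and update calls keyed on routingConfigStatic. The edge ID and addresses are made-up values.

```python
class FakeSession(object):
    """Hypothetical stand-in for the NSX client session; keeps config in memory."""

    def __init__(self):
        self.config = {'staticRouting': {'staticRoutes': {}}}

    def read(self, resource, uri_parameters=None):
        return {'status': 200, 'body': self.config}

    def update(self, resource, uri_parameters=None, request_body_dict=None):
        self.config = request_body_dict
        return {'status': 204}


def config_def_gw(client_session, esg_id, dfgw, vnic, mtu, dfgw_adminDistance):
    # Simplified copy of the function above: default the MTU, then set or clear the route.
    mtu = mtu or '1500'
    rtg_cfg = client_session.read('routingConfigStatic',
                                  uri_parameters={'edgeId': esg_id})['body']
    if dfgw:
        rtg_cfg['staticRouting']['defaultRoute'] = {
            'gatewayAddress': dfgw, 'vnic': vnic,
            'mtu': mtu, 'adminDistance': dfgw_adminDistance}
    else:
        rtg_cfg['staticRouting']['defaultRoute'] = None
    result = client_session.update('routingConfigStatic',
                                   uri_parameters={'edgeId': esg_id},
                                   request_body_dict=rtg_cfg)
    return result['status'] == 204


session = FakeSession()
changed = config_def_gw(session, 'edge-1', '10.0.0.1', '0', None, '1')
default_route = session.config['staticRouting']['defaultRoute']
```

Because no MTU is passed in, the stored default route ends up with the '1500' fallback, which is the behaviour the playbook overrides when it sets mtu explicitly.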

The Ansible playbook is then able to include the following bits to create the default gateway with the preferred settings:

  - name: NSX Edge creation
    nsx_edge_router:
      nsxmanager_spec: "{{ nsxmanager_spec }}"
      state: present
      name: "{{ edge_name }}"
      description: "{{ description }}"
      resourcepool_moid: "{{ gather_moids_cl.object_id }}"
      datastore_moid: "{{ gather_moids_ds.object_id }}"
      datacenter_moid: "{{ gather_moids_cl.datacenter_moid }}"
      interfaces:
        vnic0: {ip: "{{ ext_ip }}", prefix_len: 26, logical_switch: "{{ uplink }}", name: 'uplink0', iftype: 'uplink', fence_param: 'ethernet0.filter1.param1=1'}
        vnic1: {ip: '', prefix_len: 20, logical_switch: "{{ switch_name }}", name: 'int0', iftype: 'internal', fence_param: 'ethernet0.filter1.param1=1'}
      default_gateway: "{{ default_route }}"
      default_gateway_vnic: '0'
      mtu: '9000'
      remote_access: 'true'
      username: 'admin'
      password: "{{ nsx_pass }}"
      firewall: 'true'
      ha_enabled: 'true'
    register: create_esg
    tags: esg_create

When specifying the vNIC to use for the default gateway, the value is not the name the Ansible playbook gives the vNIC — uplink0 — but rather the vNIC number within the Edge — which will be 0 if you are using my playbook.
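The name-versus-number distinction can be made concrete with a small lookup. This is a hypothetical helper, not code from the module. It assumes the interfaces dictionary keeps the vnicN key pattern used in the playbook above.

```python
def vnic_index(interfaces, nic_name):
    """Return the vNIC number (as a string) for the interface with the given name."""
    for key, settings in interfaces.items():
        if settings.get('name') == nic_name:
            # Keys follow the playbook's 'vnicN' pattern; strip the prefix to get N.
            return key[len('vnic'):]
    raise KeyError('no interface named {!r}'.format(nic_name))


interfaces = {
    'vnic0': {'name': 'uplink0', 'iftype': 'uplink'},
    'vnic1': {'name': 'int0', 'iftype': 'internal'},
}
```

With the sample dictionary, looking up 'uplink0' yields '0', which is the value default_gateway_vnic expects.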

Once I have the SNAT/DNAT functionality added, I will write another blog post and progress on the Infrastructure-as-Code project will be nearly complete.



Docker for Ansible + VMware NSX Automation

I am writing this as I sit and watch my annual viewing of The Hobbit and The Lord of the Rings trilogy over the Christmas holiday. The next couple of weeks should hopefully provide the time necessary to complete the Infrastructure-as-Code project I undertook last month. As part of that project, I spoke previously about how Ansible is being used to provide the automation layer for the deployment and configuration of the SDDC Kubernetes stack. As part of the bootstrapping effort, I have decided to create a Docker image with the necessary components to perform the initial virtual machine deployment and NSX configuration.

The Dockerfile for the Ubuntu-based Docker container is hosted both on Docker Hub and within the GitHub repository for the larger Infrastructure-as-Code project.

When the Docker container is launched, it includes the necessary components to interact with the VMware stack, including additional modules for VM folders, resource pools and VMware NSX.

To launch the container, I am running it with the following options to include the local copies of the Infrastructure-as-Code project.

$ docker run -it --name ansible -v /Users/cmutchler/github/vsphere-kubernetes/ansible/:/opt/ansible virtualelephant/ubuntu-ansible

The Docker container is a bit on the larger side, but it is designed to run locally on a laptop or desktop. The image includes the required Python and NSX bits so that the additional GitHub repositories that are cloned into the image will operate correctly. The OpenShift project includes additional modules for interacting with vSphere folders and resource pools, while the NSX modules from the VMware GitHub repository include the necessary bits for leveraging Ansible with NSX.

Once running, the Docker container is then able to bootstrap the deployment of the Infrastructure-as-Code project using the Ansible playbooks I’ve published on GitHub. Enjoy!

Infrastructure-as-Code: Ansible for VMware NSX

As the project moves into the next phase, Ansible is beginning to be relied upon for the deployment of the individual components that will define the environment. This installment of the series is going to cover the use of Ansible with VMware NSX. VMware has provided a set of Ansible modules for integrating with NSX on GitHub. The modules easily allow the creation of NSX Logical Switches, NSX Distributed Logical Routers, NSX Edge Services Gateways (ESG) and many other components.

The modules live in the vmware/nsxansible repository on GitHub.

Step 1: Installing Ansible NSX Modules

In order to support the Ansible NSX modules, it was necessary to install several supporting packages on the Ubuntu Ansible Control Server (ACS).

$ sudo apt-get install python-dev libxml2 libxml2-dev libxslt1-dev zlib1g-dev npm
$ sudo pip install nsxramlclient
$ sudo npm install -g https://github.com/yfauser/raml2html
$ sudo npm install -g https://github.com/yfauser/raml2postman
$ sudo npm install -g raml-fleece

In addition to the Ansible NSX modules, the ACS will also require the NSX for vSphere RAML repository. The RAML specification describes the NSX for vSphere API. The repo must also be cloned to a local directory on the ACS before an Ansible playbook will run.
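A quick pre-flight check can confirm these prerequisites before the first playbook run. The sketch below is a hypothetical helper of my own, written for Python 3. It only tests that top-level modules can be located and that the RAML file exists at the path you pass in.

```python
import importlib.util
import os


def preflight(module_names, raml_path):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for name in module_names:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append('missing Python module: ' + name)
    if not os.path.isfile(raml_path):
        problems.append('RAML file not found: ' + raml_path)
    return problems
```

On the ACS this could be called as preflight(['nsxramlclient'], '/HOMEDIR/nsxraml/nsxvapi.raml'), with the path adjusted to wherever the RAML repo was cloned.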

Now that all of the prerequisites are met, the Ansible playbook for creating the NSX components can be written.

Step 2: Ansible Playbook for NSX

The first thing to know is that the GitHub repo for the NSX modules includes many great examples within the test_*.yml files, which were leveraged to create the playbook below. To understand what the Ansible Playbook has been written to create, let’s first review the logical network design for the Infrastructure-as-Code project.


The design calls for three layers of NSX virtual networking to exist — the NSX ECMP Edges, the Distributed Logical Router (DLR) and the Edge Services Gateway (ESG) for the tenant. The Ansible Playbook below assumes the ECMP Edges and DLR already exist. The playbook will focus on creating the HA Edge for the tenant and configuring the component services (SNAT/DNAT, DHCP, routing).

The GitHub repository for the NSX Ansible modules provides many great code examples. The playbook that I’ve written to create the k8s_internal logical switch and the NSX HA Edge (aka ESG) took much of the content provided and collapsed it into a single playbook. The NSX playbook I’ve written can be found in the Virtual Elephant GitHub repository for the Infrastructure-as-Code project.

As I’ve stated, this project is mostly about providing me a detailed game plan for learning several new (to me) technologies, including Ansible. The NSX playbook is the first time I’ve used an answer file to obfuscate several of the sensitive variables needed specifically for my environment. The nsxanswer.yml file includes the variables required for connecting to the NSX Manager, which is the component Ansible will be communicating with to create the logical switch and ESG.

Ansible Answer File: nsxanswer.yml

nsxmanager_spec:
  raml_file: '/HOMEDIR/nsxraml/nsxvapi.raml'
  host: 'usa1-2-nsxv'
  user: 'admin'
  password: 'PASSWORD'

The nsxvapi.raml file is the API specification file that we cloned in step 1 from the GitHub repository. The path should be modified for your local environment, as should the password: variable line for the NSX Manager.
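Since the sample file ships with placeholder values, a small check can catch an answer file that was never edited. This is a hypothetical sketch. The key names and placeholder strings come from the sample above, and the function itself is my own illustration operating on an already-loaded dictionary.

```python
REQUIRED_KEYS = ('raml_file', 'host', 'user', 'password')
PLACEHOLDERS = ('HOMEDIR', 'PASSWORD')


def check_nsxmanager_spec(spec):
    """Return a list of problems found in an nsxmanager_spec dictionary."""
    problems = ['missing key: ' + key for key in REQUIRED_KEYS if key not in spec]
    for key in REQUIRED_KEYS:
        value = str(spec.get(key, ''))
        # Flag values that still contain the sample file's placeholder text.
        if any(marker in value for marker in PLACEHOLDERS):
            problems.append('placeholder left in: ' + key)
    return problems


sample = {
    'raml_file': '/HOMEDIR/nsxraml/nsxvapi.raml',
    'host': 'usa1-2-nsxv',
    'user': 'admin',
    'password': 'PASSWORD',
}
```

Run against the unedited sample, the check reports the raml_file and password entries, which are exactly the two lines that need local values.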

Ansible Playbook: nsx.yml

---
- hosts: localhost
  connection: local
  gather_facts: False
  vars_files:
    - nsxanswer.yml
  vars_prompt:
  - name: "vcenter_pass"
    prompt: "Enter vCenter password"
    private: yes
  vars:
    vcenter: "usa1-2-vcenter"
    datacenter: "Lab-Datacenter"
    datastore: "vsanDatastore"
    cluster: "Cluster01"
    vcenter_user: "administrator@vsphere.local"
    switch_name: "{{ switch }}"
    uplink_pg: "{{ uplink }}"
    ext_ip: "{{ vip }}"
    tz: "tzone"

  tasks:
  - name: NSX Logical Switch creation
    nsx_logical_switch:
      nsxmanager_spec: "{{ nsxmanager_spec }}"
      state: present
      transportzone: "{{ tz }}"
      name: "{{ switch_name }}"
      controlplanemode: "UNICAST_MODE"
      description: "Kubernetes Infra-as-Code Tenant Logical Switch"
    register: create_logical_switch

  - name: Gather MOID for datastore for ESG creation
    vcenter_gather_moids:
      hostname: "{{ vcenter }}"
      username: "{{ vcenter_user }}"
      password: "{{ vcenter_pass }}"
      datacenter_name: "{{ datacenter }}"
      datastore_name: "{{ datastore }}"
      validate_certs: False
    register: gather_moids_ds
    tags: esg_create

  - name: Gather MOID for cluster for ESG creation
    vcenter_gather_moids:
      hostname: "{{ vcenter }}"
      username: "{{ vcenter_user }}"
      password: "{{ vcenter_pass }}"
      datacenter_name: "{{ datacenter }}"
      cluster_name: "{{ cluster }}"
      validate_certs: False
    register: gather_moids_cl
    tags: esg_create

  - name: Gather MOID for uplink
    vcenter_gather_moids:
      hostname: "{{ vcenter }}"
      username: "{{ vcenter_user }}"
      password: "{{ vcenter_pass }}"
      datacenter_name: "{{ datacenter }}"
      portgroup_name: "{{ uplink_pg }}"
      validate_certs: False
    register: gather_moids_upl_pg
    tags: esg_create

  - name: NSX Edge creation
    nsx_edge_router:
      nsxmanager_spec: "{{ nsxmanager_spec }}"
      state: present
      name: "{{ switch_name }}-edge"
      description: "Kubernetes Infra-as-Code Tenant Edge"
      resourcepool_moid: "{{ gather_moids_cl.object_id }}"
      datastore_moid: "{{ gather_moids_ds.object_id }}"
      datacenter_moid: "{{ gather_moids_cl.datacenter_moid }}"
      interfaces:
        vnic0: {ip: "{{ ext_ip }}", prefix_len: 26, portgroup_id: "{{ gather_moids_upl_pg.object_id }}", name: 'uplink0', iftype: 'uplink', fence_param: 'ethernet0.filter1.param1=1'}
        vnic1: {ip: '', prefix_len: 20, portgroup_id: "{{ switch_name }}", name: 'int0', iftype: 'internal', fence_param: 'ethernet0.filter1.param1=1'}
      default_gateway: "{{ gateway }}"
      remote_access: 'true'
      username: 'admin'
      password: "{{ nsx_admin_pass }}"
      firewall: 'false'
      ha_enabled: 'true'
    register: create_esg
    tags: esg_create

The playbook expects to be provided three extra variables from the CLI when it is executed — switch, uplink and vip. The switch variable defines the name of the logical switch, the uplink variable defines the uplink VXLAN portgroup the tenant ESG will connect to, and the vip variable is the external VIP to be assigned from the network block. At the time of this writing, these sorts of variables continue to be command-line based, but will likely be moved to a single Ansible answer file as the project matures. Having a single answer file for the entire set of playbooks should simplify the adoption of the Infrastructure-as-Code project into other vSphere environments.
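One way to supply the three extra variables is to build the ansible-playbook invocation programmatically. The sketch below is a hypothetical wrapper. The variable names come from the playbook, while the sample values other than k8s_internal are invented for illustration.

```python
import json


def playbook_command(playbook, switch, uplink, vip):
    """Build an ansible-playbook argument list with the three extra variables."""
    extra_vars = {'switch': switch, 'uplink': uplink, 'vip': vip}
    # --extra-vars accepts a JSON document, which avoids shell-quoting surprises.
    return ['ansible-playbook', playbook,
            '--extra-vars', json.dumps(extra_vars, sort_keys=True)]


command = playbook_command('nsx.yml', 'k8s_internal', 'vxw-dvs-uplink', '10.0.0.10')
```

The resulting list can be passed to subprocess.run(command, check=True). Keeping the arguments as a list rather than a single string means no shell is involved in parsing the values.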

Now that Ansible playbooks exist for creating the NSX components and the VMs for the Kubernetes cluster, the next step will be to begin configuring the software within CoreOS to run Kubernetes.

Stay tuned.

Infrastructure-as-Code: Getting started with Ansible

The series so far has covered the high-level design of the project, how to bootstrap CoreOS and how Ignition works to configure a CoreOS node. The next stage of the project will begin to leverage Ansible to fully automate and orchestrate the instantiation of the environment. Ansible will initially be used to deploy the blank VMs and gather the IP addresses and FQDNs of each node created.

Ansible is one of the new technologies that I am using the Infrastructure-as-Code project to learn. My familiarity with Chef was helpful, but I still wanted to get a good primer on Ansible before proceeding. Fortunately, Pluralsight is a great training tool and the Hands-on Ansible course by Aaron Paxon was just the thing to start with. Once I worked through the video series, I dived right into writing the Ansible playbook to deploy the virtual machines for CoreOS to install. I quickly learned there were a few extras I needed on my Ansible control server before it would all function properly.

Step 1: Configure Ansible Control Server

As I stated before, I have deployed an Ubuntu Server 17.10 node within the environment where tftpd-hpa is running for the CoreOS PXEBOOT system. The node is also being leveraged as the Ansible control server (ACS). The ACS node required a few additional packages to be present on the system in order for Ansible to be the latest version and include the VMware modules needed.

To get started, note that the Ubuntu repositories only include Ansible v2.3.1.0, which is not from the latest 2.4 branch.

There are several VMware module updates in Ansible 2.4 that I wanted to leverage, so I needed to first update Ansible on the Ubuntu ACS.

$ sudo apt-add-repository ppa:ansible/ansible
$ sudo apt-get update
$ sudo apt-get upgrade

If you have not yet installed Ansible on the local system, run the following command:

$ sudo apt-get install ansible

If you need to upgrade Ansible from the Ubuntu package to the new PPA repository package, run the following command:

$ sudo apt-get upgrade ansible

Now the Ubuntu ACS is running Ansible v2.4.1.0.

In addition to just having Ansible and Python installed, there are additional Python pieces we need in order for all of the VMware Ansible modules to work correctly.

$ sudo apt-get install python-pip
$ sudo pip install --upgrade pyvmomi
$ sudo pip install pysphere
$ sudo pip list | grep pyvmomi

Note: Make sure the installed pyvmomi is a 6.5.x version to have all the latest code.
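The same version check can be scripted instead of read off the pip output. The sketch below is a hypothetical helper that assumes Python 3.8 or later for importlib.metadata, so it would not run on the Python 2.7 interpreter used on the ACS at the time.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_release_line(version, prefix='6.5'):
    """True when the version equals the prefix or starts with 'prefix.'."""
    return version is not None and (
        version == prefix or version.startswith(prefix + '.'))
```

Calling is_release_line(installed_version('pyvmomi')) then answers the question in the note above. Matching on '6.5.' rather than '6.5' keeps a version such as 6.50 from passing by accident.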

The final piece I needed was an additional Ansible module that allows new VM folders to be created. There is a third-party module, called vmware_folder, which includes the needed functionality. After cloning the openshift-ansible-contrib repo, I copied the vmware_folder.py file into the ACS directory /usr/lib/python2.7/dist-packages/ansible/modules/cloud/vmware.

The file can be found on GitHub in that repository.

The Ubuntu ACS node now possesses all of the necessary pieces to get started with the environment deployment.

Step 2: Ansible Playbook for deployment

The starting point for the project is to write the Ansible playbook that will deploy the virtual machines and power them on — thus allowing the PXEBOOT system to download and install CoreOS onto each node. Ansible has several VMware modules that will be leveraged as the project progresses.

The Infrastructure-as-Code project source code is hosted on GitHub and is available for download and use. The project is currently under development and is being written in stages. By the end of the series, the entire instantiation of the environment will be fully automated. As the series progresses, the playbooks will get built out and become more complete.

The main.yml Ansible playbook currently includes two tasks — one for creating the VM folder and a second for deployment of the VMs. It uses a blank VM template that already exists on the vCenter Server.

When the playbook is run from the ACS, it will deploy a dynamic number of nodes, create a new VM folder and allow the user to specify a VM-name prefix.
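The combination of a node count and a name prefix can be pictured with a short sketch. The naming scheme here (prefix, hyphen, zero-padded index) is my own assumption for illustration and is not taken from main.yml.

```python
def node_names(prefix, count, width=2):
    """Return names such as 'kube-01', 'kube-02' for the requested node count."""
    if count < 1:
        raise ValueError('count must be at least 1')
    # The nested {} fills in the zero-padding width from the third argument.
    return ['{}-{:0{}d}'.format(prefix, index, width)
            for index in range(1, count + 1)]
```

Generating the list up front makes it easy to loop over the same names later when gathering each node's IP address and FQDN.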

When the deployment is complete, the VMs will be powered on and booting CoreOS. Depending on the download speeds in the environment, the over/under for the CoreOS nodes to be fully online is roughly 10 minutes right now.

The environment is now deployed and ready for Kubernetes! Next week, the series will focus on using Ansible for installing and configuring Kubernetes on the nodes post-deployment. As always, feel free to reach out to me over Twitter if you have questions or comments.

[Introduction] [Part 1 – Bootstrap CoreOS with Ignition] [Part 2 – Understanding CoreOS Ignition] [Part 3 – Getting started with Ansible]

Infrastructure-as-Code: Project Overview

In an effort to get caught up with the Cloud Native space, I am embarking on an effort to build a completely dynamic Kubernetes environment entirely through code. To accomplish this, I am using (and learning) several technologies, including:

  • Container OS (CoreOS) for the Kubernetes nodes.
  • Ignition for configuring CoreOS.
  • Ansible for automation and orchestration.
  • Kubernetes
  • VMware NSX for micro-segmentation, load balancing and DHCP.

There are a lot of great articles on the Internet around Kubernetes, CoreOS and other Cloud Native technologies. If you are unfamiliar with Kubernetes, I highly encourage you to read the articles written by Hany Michaels (Kubernetes Introduction for VMware Users and Kubernetes in the Enterprise – The Design Guide). These are especially useful if you already have a background in VMware technologies and are just getting started in the Cloud Native space. Mr. Michaels does an excellent job comparing concepts you are already familiar with and aligning them with Kubernetes components.

Moving on, the vision I have for this Infrastructure-as-Code project is to build a Kubernetes cluster leveraging my vSphere lab with the SDDC stack (vSphere, vCenter, vSAN and NSX). I want to codify it in a way that an environment can be stood up or torn down in a matter of minutes without having to interact with any user-interface. I am also hopeful the lessons learned whilst working on this project will be applicable to other cloud native technologies, including Mesos and Cloud Foundry environments.

Logically, the project will create the following within my vSphere lab environment:


I will cover the NSX components in a future post, but essentially each Kubernetes environment will be attached to an HA pair of NSX Edges. The ECMP Edges and Distributed Logical Router are already in place, as they are providing upstream network connectivity for my vSphere lab. The project will focus on the internal network (VXLAN-backed), attached to the NSX HA Edge devices, which will provide the inter-node network connectivity. The NSX Edge is configured to provide firewall, routing and DHCP services to all components inside its network space.

The plan for the project and the blog series is to document every facet of development and execution of the components, with the end goal being that anyone reading the series can understand how all the pieces interrelate with one another. The series will kick off with the following posts:

  • Bootstrapping CoreOS with Ignition
  • Understanding Ignition files
  • Using Ansible with Ignition
  • Building Kubernetes cluster with Ansible
  • Deploying NSX components using Ansible
  • Deploying full stack using Ansible

If time allows, I may also embark on migrating from NSX-V to NSX-T for providing some of the tenant software-defined networking.

I hope you enjoy the series!
