Infrastructure-as-Code: Ansible for VMware NSX

As the project moves into the next phase, Ansible is beginning to be relied upon for the deployment of the individual components that will define the environment. This installment of the series is going to cover the use of Ansible with VMware NSX. VMware has provided a set of Ansible modules for integrating with NSX on GitHub. The modules easily allow the creation of NSX Logical Switches, NSX Distributed Logical Routers, NSX Edge Services Gateways (ESG) and many other components.

The GitHub repository can be found here.

Step 1: Installing Ansible NSX Modules

In order to support the Ansible NSX modules, it was necessary to install several supporting packages on the Ubuntu Ansible Control Server (ACS).

$ sudo apt-get install python-dev libxml2 libxml2-dev libxslt1-dev zlib1g-dev npm
$ sudo pip install nsxramlclient
$ sudo npm install -g
$ sudo npm install -g
$ sudo npm install -g raml-fleece

In addition to the Ansible NSX modules, the ACS server will also require the vSphere for NSX RAML repository. The RAML specification includes information on the NSX for vSphere API. The repo will need to be cloned to a local directory on the ACS as well before execution of an Ansible Playbook will work.

Now that all of the prerequisites are met, the Ansible playbook for creating the NSX components can be written.

Step 2: Ansible Playbook for NSX

The first thing to know is the GitHub repo for the NSX modules include many great examples within the test_*.yml files which were leveraged to create the playbook below. To understand what the Ansible Playbook has been written to create, let’s first review the logical network design for the Infrastructure-as-Code project.


The design calls for three layers of NSX virtual networking to exist — the NSX ECMP Edges, the Distributed Logical Router (DLR) and the Edge Services Gateway (ESG) for the tenant. The Ansible Playbook below assumes the ECMP Edges and DLR already exist. The playbook will focus on creating the HA Edge for the tenant and configuring the component services (SNAT/DNAT, DHCP, routing).

The GitHub repository for the NSX Ansible modules provides many great code examples. The playbook that I’ve written to create the k8s_internal logical switch and the NSX HA Edge (aka ESG) took much of the content provided and collapsed it into a single playbook. The NSX playbook I’ve written can be found in the Virtual Elephant GitHub repository for the Infrastructure-as-Code project.

As I’ve stated, this project is mostly about providing me a detailed game plan for learning several new (to me) technologies, including Ansible. The NSX playbook is the first time I’ve used an answer file to obfuscate several of the sensitive variables needed specifically for my environment. The nsxanswer.yml file includes the variable required for connecting to the NSX Manager, which is the component Ansible will be communicating with to create the logical switch and ESG.

Ansible Answer File: nsxanswer.yml (link)

  1 nsxmanager_spec:
  2         raml_file: '/HOMEDIR/nsxraml/nsxvapi.raml'
  3         host: 'usa1-2-nsxv'
  4         user: 'admin'
  5         password: 'PASSWORD'

The nsxvapi.raml file is the API specification file that we cloned in step 1 from the GitHub repository. The path should be modified for your local environment, as should the password: variable line for the NSX Manager.

Ansible Playbook: nsx.yml (link)

  1 ---
  2 - hosts: localhost
  3   connection: local
  4   gather_facts: False
  5   vars_files:
  6     - nsxanswer.yml
  7   vars_prompt:
  8   - name: "vcenter_pass"
  9     prompt: "Enter vCenter password"
 10     private: yes
 11   vars:
 12     vcenter: "usa1-2-vcenter"
 13     datacenter: "Lab-Datacenter"
 14     datastore: "vsanDatastore"
 15     cluster: "Cluster01"
 16     vcenter_user: "administrator@vsphere.local"
 17     switch_name: "{{ switch }}"
 18     uplink_pg: "{{ uplink }}"
 19     ext_ip: "{{ vip }}"
 20     tz: "tzone"
 22   tasks:
 23   - name: NSX Logical Switch creation
 24     nsx_logical_switch:
 25       nsxmanager_spec: "{{ nsxmanager_spec }}"
 26       state: present
 27       transportzone: "{{ tz }}"
 28       name: "{{ switch_name }}"
 29       controlplanemode: "UNICAST_MODE"
 30       description: "Kubernetes Infra-as-Code Tenant Logical Switch"
 31     register: create_logical_switch
 33   - name: Gather MOID for datastore for ESG creation
 34     vcenter_gather_moids:
 35       hostname: "{{ vcenter }}"
 36       username: "{{ vcenter_user }}"
 37       password: "{{ vcenter_pass }}"
 38       datacenter_name: "{{ datacenter }}"
 39       datastore_name: "{{ datastore }}"
 40       validate_certs: False
 41     register: gather_moids_ds
 42     tags: esg_create
 44   - name: Gather MOID for cluster for ESG creation
 45     vcenter_gather_moids:
 46       hostname: "{{ vcenter }}"
 47       username: "{{ vcenter_user }}"
 48       password: "{{ vcenter_pass }}"
 49       datacenter_name: "{{ datacenter }}"
 50       cluster_name: "{{ cluster }}"
 51       validate_certs: False
 52     register: gather_moids_cl
 53     tags: esg_create
 55   - name: Gather MOID for uplink
 56     vcenter_gather_moids:
 57       hostname: "{{ vcenter }}"
 58       username: "{{ vcenter_user}}"
 59       password: "{{ vcenter_pass}}"
 60       datacenter_name: "{{ datacenter }}"
 61       portgroup_name: "{{ uplink_pg }}"
 62       validate_certs: False
 63     register: gather_moids_upl_pg
 64     tags: esg_create
 66   - name: NSX Edge creation
 67     nsx_edge_router:
 68       nsxmanager_spec: "{{ nsxmanager_spec }}"
 69       state: present
 70       name: "{{ switch_name }}-edge"
 71       description: "Kubernetes Infra-as-Code Tenant Edge"
 72       resourcepool_moid: "{{ gather_moids_cl.object_id }}"
 73       datastore_moid: "{{ gather_moids_ds.object_id }}"
 74       datacenter_moid: "{{ gather_moids_cl.datacenter_moid }}"
 75       interfaces:
 76         vnic0: {ip: "{{ ext_ip }}", prefix_len: 26, portgroup_id: "{{ gather_moids_upl_pg.object_id }}", name: 'uplink0', iftype: 'uplink', fence_param: 'ethernet0.filter1.param1=1'}
 77         vnic1: {ip: '', prefix_len: 20, portgroup_id: "{{ switch_name }}", name: 'int0', iftype: 'internal', fence_param: 'ethernet0.filter1.param1=1'}
 78       default_gateway: "{{ gateway }}"
 79       remote_access: 'true'
 80       username: 'admin'
 81       password: "{{ nsx_admin_pass }}"
 82       firewall: 'false'
 83       ha_enabled: 'true'
 84     register: create_esg
 85     tags: esg_create

The playbook expects to be provided three extra variables from the CLI when it is executed — switch, uplink and vip. The switch variable defines the name of the logical switch, the uplink variable defines the uplink VXLAN portgroup the tenant ESG will connect to, and the vip variable is the external VIP to be assigned from the network block. At the time of this writing, these sorts of variables continue to be command-line based, but will likely be moved to a single Ansible answer file as the project matures. Having a single answer file for the entire set of playbooks should simplify the adoption of the Infrastructure-as-Code project into other vSphere environments.

Now that Ansible playbooks exist for creating the NSX components and the VMs for the Kubernetes cluster, the next step will be to begin configuring the software within CoreOS to run Kubernetes.

Stay tuned.

VMware Integrated OpenStack vs vCloud Director


If you have not already noticed, a lot of my work these days is in the OpenStack space, specifically using VMware Integrated OpenStack. As often happens when working on a new technology or service offering, I get the question

Is OpenStack going to replace VMware vCloud Director as my primary cloud management platform (CMP)?

I remember during TAM-day at VMworld 2014 the announcement for VMware Integrated OpenStack (VIO) and one of the members of the audience asked Carl Eschenbach why VMware was pivoting away from vRealize Automation (vRA) only a year after telling customers that was the go-forward strategy. It begged the question, just which CMP was VMware backing long-term? I don’t have an answer for you. Personally, I think the three CMPs serve different purposes and fit within different customer use-cases — some of which overlap, others don’t.

In order to answer the question, I put together some information comparing the two products to highlight both their similarities and differences. The following is not a complete list, but will highlight the key points for a team or business trying to determine which direction to use for their private cloud.

Comparison Chart

CategoryVMware Integrated OpenStackVMware vCloud Director
ComputeCombined CPU & Memory resourcesCombined CPU & Memory resources
ComputevSphere cluster endpointsvSphere cluster endpoints
vSphere Resource Pool endpoints
Compute2 vCenter Server maximum7+ vCenter Server support
ComputevSphere HA & DRS supportvSphere HA & DRS support
Compute25,000/40,000 tenant-VM maximum
StorageGlobal catalog service (Glance)Global catalog service
StoragePluggable block volumes (Cinder)
VMFS-backed storage
VMFS backed-storage
StoragevSphere Storage Policy enforcementTiered storage through storage policies
StoragevSphere Storage DRS supportvSphere Storage DRS support
StorageObject-based storage (Swift)
NetworkingVLAN-backed portgroup integration
- Single vCenter
NetworkingVMware NSX-v integration
- Single or multi-vCenter
VMware NSX-v integration
NetworkingFull NSX-v functionality
- NSX Edge tenant support.
- NSX DFW for Security Groups
- Load Balancer (as-a-Service)
Limited NSX-v functionality
- NSX Edge tenant support
- No NSX DFW support.
- No NSX Load Balancer support.
NetworkingShared external tenant networks
- No isolation between tenants
Per-tenant, isolated external tenant networks
NetworkingSingle cluster for Edge services for all tenants.Edge/Compute services present on all compute workload cluster.
Cloud ServicesGlobal & Tenant defined VM sizes (flavors)
Cloud ServicesGlobal & Tenant defined VM images.Global catalog for sharing images and vApps.
Cloud ServicesMetering & telemetry functionality for auto-scaling (Ceilometer)
Cloud ServicesApplication stack orchestration (Heat)
- Standard JSON & YAML support.
- Cross-cloud functional.
vApp deployment orchestration
Cloud ServicesStandard API framework
- Industry standard APIs
- Compatible with AWS & S3
Proprietary API framework
ManagementvApp plugin inside vCenter for administrative tasks.
ManagementDistributed Management stack
- Applications clustered within management stack (MongoDB, MariaDB, etc).
- Integrated load balancer for services.
Multiple vCD cell support
- External load balancer required.
- Database is not natively clustered.

Ultimately the answer to the question is going to depend on your use-case and your requirements. I believe either option could work in a majority of the private cloud architectures out there — especially considering how VMware Integrated OpenStack has simplified the deployment and lifecycle management of OpenStack.

Let me know what you think on Twitter! Enjoy.

VMware Storage I/O Control (SIOC) Overview

Data digital flow

The idea for this post has been on the backlog for a really long time. I recently spent a significant amount of time reviewing VMware SIOC — or Storage I/O Control — and became intimately more familiar with its inner-workings. First off, I do not think very much of this information will be new to those who have used, or are using SIOC in a production VMware environment. However, when looking for information on SIOC, I was not able to find a single all-inclusive resource. I am hoping this post will help others understand how the inner-workings of SIOC operate and how to determine if your use-case could be met by enabling SIOC.

Storage I/O Control

VMware first introduced Storage I/O Control, or SIOC, back in vSphere v4.1 and has steadily improved it with nearly every release of vSphere from that time. The latest version of SIOC in vSphere 6.0 includes several enhancements that are helping the adoption rate within enterprise organizations. Storage I/O Control is essentially a disk scheduler that monitors the datastores it is enabled on to determine if there is resource contention occurring. When it detects resource contention, SIOC is able to isolate which VMDK (and therefore VM object) that is causing the contention and take action. This becomes challenging where the datastore is a shared resource across multiple ESXi hosts, clusters or vCenters.

SIOC supports Fibre Channel, ISCSI and NFS datastores. RDM devices are not supported.

In order to use SIOC several prerequisites have to be met:

  • Datastore needs to be isolated to a single vCenter domain.
  • Single extent datastores only.
  • Non-shared spindles for underlying storage array isolated to a single vCenter domain.

A SIOC friendly logical design for the storage layer looks like the following:


The diagram illustrates how a storage array (iSCSI or Fibre Channel) would carve out the storage pools (disk groups) and present LUNs to the vSphere layer that are then mapped to a VMFS datastore. Notice there are no LUNs or datastores being shared across a vCenter domain.

SIOC creates a metadata file on each datastore it is enabled on and that metadata file is used when resource contention occurs to help SIOC identify which VMDK is the culprit. After it identifies the culprit, SIOC will begin limiting the amount of I/O operations can be issued to that datastore. That metadata file is only visible within the SIOC domain it was created on — meaning it can only be seen by the vCenter domain it was created on.

Congestion Thresholds

When SIOC is enabled on a datastore, the vSphere Administrator is given two options to choose from for threshold monitoring — peak throughput or response time.


The two defaults are 90% for peak throughput or 30 milliseconds for response times. It is critical you understand the workload present on the datastore so that these values are set properly. If the threshold values are improperly set, the performance of the virtual machines can be affected in a negative manner. The storage layer SLA requirements of the environment and the physical storage array capabilities will be contributing factors here to how you design the SIOC threshold values.

The peak throughput percentage is calculated by vCenter based on the storage array capabilities. There is a table that has been published with suggestions on what the responses times should be based on the underlying disk types, however it is several years old and your milage may vary. I will forego posting it here, however I will note that the baseline of 30 milliseconds may be too high for some modern storage arrays. For example, the environment I work in now is targeting 15-20 milliseconds as the threshold for response times based on the hardware of the storage arrays and workloads placed on them. Again, understanding your workload is key!

Resource Shares & Limits

If the datastore is not using SIOC, the device resources are divided evenly based on the number of VM objects on the datastore. However, it becomes a first-come, first-served environment. When all of the VM objects are behaving nicely, then everyone gets the same amount of resources allocated to it. However, when one VM object starts to misbehave, there is nothing in place to prevent that VM object from consuming more of his “fair share” of resources. This is commonly referred to as the “noisy neighbor” issue.

When SIOC is enabled on a datastore, based on the resource shares and limits configured, it will prevent the noisy neighbor issue from occurring (theoretically). That does not mean that a single VM could not consume additional resources above his allocated share — it just means that if resource contention is occurring, SIOC will begin to balance the resources equally to all of its VM objects based on their resource shares and limits. Remember, the goal in a storage layer is not to prevent VM objects from having the resources they need — it should be designed to allow them to have what they need without adversely affecting everyone else.

Resource Shares

These work in a similar fashion as resource shares on vCenter Resource Pools. The resource share value is taken, then it is divided by the number of VM objects within that assigned group and distributes the resources evenly across each object. Specifically with SIOC, the number of shares assigned to all the VM objects, within a given ESXi host, will be totaled and then divided by the number of objects.


Beyond resource shares for storage, limits provide a hard upper-bound for storage IO traffic on a virtual machine. The key difference between limits and resources shares is that a limit is enforced on a virtual machine even if the storage array is not currently under contention. Resource shares are only enforced when IO contention is occurring. The default for SIOC limits is for each virtual machine to be unlimited. I suggest being very careful when applying a limit on a virtual machine within an environment.


I think the methodology used by SIOC works really well with shared storage arrays and is one of the key features in vSphere that should be used whenever possible inside a private cloud. One important thing to note is that SIOC does not work with Virtual SAN. Tomorrow’s post will talk about the methodology Virtual SAN uses and talk about the advantages SIOC has over the Virtual SAN IO limiting.


VMware Virtual SAN Hyper-converged Calculator


I have been heavily involved in designing our next-generation, large-scale hyper-converged (HCI) private cloud architecture at work the past couple of months. As part of that design, we needed a way to easily calculate resources available and cluster sizes using VMware Virtual SAN. When determining the resources available and the effects of the new Virtual SAN 6.2 features, the calculations became rather complex pretty quickly. A spreadsheet was born.

The spreadsheet allows a user to input the characteristics of their HCI nodes, and from there the spreadsheet will calculate resources available per node and per cluster size (4 nodes – 64 nodes). The key assistance the spreadsheet provides is the ability to specify a VM unit that can be used to determine how many units per server are necessary to fulfill an architectures requirements. The VM unit should be based off of the workload (known or expected) that will operate within the architecture.

The spreadsheet also allows the user to input the VSAN FTT policies, VSAN reduplication efficiency factor and memory overcommitment factors — all in an effort to help the user best determine what cluster sizes should be used and how different server configurations effect the calculations.

A few key cells that should be modified by the user initially:

  • B2-B5 – HCI node CPU characteristics
  • B10 – HCI node Memory characteristic
  • B15-16,B18-19 – HCI node VSAN disk configuration
  • B22-28 – Expected/desired VSAN and cluster efficiencies. A value of 1.0 for any efficiency factor means is the baseline.

From there, the remaining cells will be updated and provide a HCI summary node box (highlighted in Yellow) and cluster nodes sizes. The user can then see what the different configurations will yield with a VSAN RAID-1, VSAN RAID-5 and VSAN RAID-6 configuration based on the values inputted in the spreadsheet.

The spreadsheet takes into consideration the number of VSAN disk groups, the ESXi system overhead for memory and CPU, and the overhead VSAN 6.2 introduces as well.

All-in-all, this has proven to be a good tool for our team as we’ve been working on our new HCI design and hopefully will be a useful tool for others as well.

The spreadsheet can be downloaded here.

VMware BDE Template Snapshot

The VMware BDE template uses a snapshot to perform the cloning operation as it deploys a cluster. The ability to create a cloned VM from a snapshot is exposed in the vSphere API with the CloneVM_Task. As part of regular template maintenance, I run a yum update command to make sure the OS gets regular updates and security patches. It helps when installing packages like Docker to make sure I’m as close to the stable CentOS 7 branch as possible. However, if you were to simply power on the template and run an OS update those changes would not be realized in new cluster deployments.

If you look at your BDE template, the snapshot the Management server uses can be seen.


By deleting the snapshot, any changes you have made to the template will be used during future cluster deployments. It is not necessary to do anything else. The next cluster deployment, if the template is missing, the BDE framework will create a new one and proceed to use it.

The ability to update the BDE template will assist you in the lifecycle management of your Hadoop, Apache Mesos and all other cluster deployments you are using the VMware Big Data Extensions framework for. Enjoy!