NSX DLR Designated Instance

nsx designated instance

While it may sound like a great TV show, we are going to talk about something slightly different: the NSX Distributed Logical Router (DLR) Designated Instance. NSX has many great features, and also many caveats that come along with implementing some of those features, such as needing a Designated Instance when using a DLR.

So what is a Designated Instance? Honestly, I did not know what it was until a conversation earlier today with a few co-workers who are a bit more knowledgeable about NSX than I am. Essentially, a Designated Instance is an elected ESXi host that initially answers all new requests; in other words, a single point of failure.

Let’s look at the logical network diagram I posted yesterday.

nsx-dlr-openstack

Pretty sweet, right?

The issue arises when the DLR is connected directly to a VLAN. While technically not a problem (it does exactly what you would expect it to do), it requires one of the ESXi hosts in the transport zone to act as the Designated Instance. The result is that if the ESXi host acting as the Designated Instance fails, any new traffic will fail until the election process completes and a new Designated Instance is chosen.

So is it possible to not need a Designated Instance when using a DLR? Yes.

It involves introducing another logical NSX layer into the virtual network design. If you saw my tweet earlier, this is what I meant.

I like NSX, but sometimes I think it adds a little too much complexity for operational simplicity.

Adding a set of ECMP Edges above the DLR and connecting the two together eliminates the requirement for NSX to use a Designated Instance. Here is what an alternative to the previous design would look like.

external openstack

Essentially, what I've done is create another VXLAN, with a corresponding NSX Logical Switch, and connect the uplink of the DLR to it. The ECMP Edges then use the same Logical Switch as their internal interface. It is on the uplink side of the ECMP Edges that the physical-to-virtual (P2V) layer takes place and the VLAN is connected.

This design allows the environment to run a dynamic routing protocol between the DLR and the ECMP Edges, and between the ECMP Edges and the upstream physical network, although mileage may vary depending on your physical network. The ECMP Edges add scalability, limited to eight Edges, based on the amount of north-south network traffic and the bandwidth required to meet tenant needs. Features like vSphere anti-affinity rules can mitigate the failure of a single ESXi host, which you cannot do when there is a Designated Instance. The design can also take an N+x scenario into consideration when deciding when to scale the ECMP Edges.

So many options open up when NSX is introduced into an architecture, along with a lot of extra complexity. Ultimately the decision should be based on the requirements and the stakeholders' risk acceptance. Relying on a Designated Instance may be acceptable to a stakeholder, while adding more complexity to the design may not be.

Until next time, enjoy!

Multi-tenant OpenStack with NSX – Part 3

design-sla-banner

This next post in the series about multi-tenant OpenStack with NSX will discuss the use of a Distributed Logical Router as the bridge between OpenStack and the physical network. If you have not read the previous posts in the series, you can catch up by reading this one.

Originally the plan had been to segment each OpenStack tenant off with their own HA pair of NSX Edges. However, after discovering that OpenStack honors neither the tenant-id parameter nor the disabled Shared parameter within the external network object, an adjustment had to be made. Working through the problem, it became clear that an NSX Distributed Logical Router (DLR) could be leveraged and would also scale as the environment grows beyond a few dozen tenants. The new multi-tenant network design for OpenStack now looks like this:

nsx-dlr-openstack

The logical diagram shows how the uplink of the DLR is the upstream (north-south) boundary for the environment. The internal interface on the DLR is the external OpenStack network, and is leveraging VXLAN to provide the floating IP addresses the OpenStack tenants will consume. If you are unfamiliar with how a DLR operates, I recommend reading this post from Roie Ben Haim on his blog.

Basically, the DLR relies upon two components: the control VMs and a kernel module inside vSphere that injects routing into each ESXi host within the NSX transport zone. It is the inclusion of this kernel module on every ESXi host that will allow this virtual network design to scale as the environment does. In the previous design, the individual NSX Edges would merely be deployed in an HA pair, and only the active VM would handle all of the traffic for that particular tenant. With the DLR, although all tenant traffic still goes through a single layer, that layer is distributed across the entire workload environment.

After the DLR is created, the routes can be seen within each ESXi host, as shown in the following image.

ext_openstack_dlr19
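
If you prefer the command line over the vSphere Client, the same routing information can be pulled directly from an ESXi host. This is a minimal sketch assuming an NSX-v environment; the VDR instance name (default+edge-1) is a placeholder and the option syntax can vary slightly between NSX versions.

## List the DLR (VDR) instances present on this ESXi host
$ net-vdr --instance -l

## Display the route table for a specific VDR instance
$ net-vdr --route -l default+edge-1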

The downside that still remains is the shared pool of IP addresses that all tenants consume from. Operationally, it means managing the tenant quotas for floating IP addresses and making sure there is no over-allocation. I would still like to see the OpenStack community take on the extra work of honoring the tenant-id parameter when creating an external network within OpenStack, so that the option would exist to have individual tenant floating IP address pools.
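
Keeping that shared pool under control mostly comes down to Neutron quotas. As a hedged example using the neutron client, where the tenant ID and the limit are placeholders:

## Cap the number of floating IP addresses a tenant may allocate
$ neutron quota-update --tenant-id <tenant-uuid> --floatingip 10

## Review the effective quotas for that tenant
$ neutron quota-show --tenant-id <tenant-uuid>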

Tuesday’s post will include detailed instructions on how to deploy and configure both the NSX portion of this deployment and the OpenStack pieces to tie it all together. Enjoy!

Multi-tenant OpenStack with NSX – Part 2

multi-tenancy-network

The post yesterday discussed a method for having segmented multi-tenant networks inside of OpenStack. As a series of test cases was worked through with a setup of this nature, a large, gaping hole in OpenStack came into view.

What does the previously described multiple external networks look like inside OpenStack?

admin-external-networks
Admin view with multiple external networks defined in OpenStack.
tenant1-external-networks
Networks from Tenant 1 view.
tenant2-external-networks
Networks from Tenant 2 view.

In the second and third screenshots, you can see that both tenants see both external networks, but each only sees a subnet listed for the external network that was created with its respective tenant-id. At first glance, this would seem to be doing what was intended: each tenant receiving their own external network to consume floating IP addresses from. Unfortunately, it begins to break down when a tenant goes to Compute –> Access & Security –> Floating IPs in Horizon.

floating-ips
Multiple tenant floating IP addresses.

The above screenshot shows a tenant being assigned a floating IP address from what should have been an external network they did not have access to.
facepalm

I felt pretty much like Captain Picard after working through the test cases. Surely, OpenStack would allow a design where tenants have segmented external networks — right?

Unfortunately, OpenStack does not honor this type of segmented external networking design; it will allow any tenant to consume or claim a floating IP address from any of the other external networks. To read how OpenStack fully implements external networks, you can read the documentation here. The issue is highlighted in this passage:

Nevertheless, the concept of ‘external’ implies some forms of sharing, and this has some bearing on the topologies that can be achieved. For instance it is not possible at the moment to have an external network which is reserved to a specific tenant.

Essentially, OpenStack Neutron thinks of external networks differently than I believe most architects do. It also does not honor the tenant-id attribute that is specified when the network is created, nor the shared attribute being left disabled on the external network. The methodology OpenStack Neutron uses is more in line with the AWS consumption model: everyone drinks from the same pool and there is no segmentation between the tenants. I personally do not believe that model works in a private cloud where there are multiple tenants.
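
For anyone wanting to reproduce the behavior outside of Horizon, the test cases boiled down to something like the following sketch. The network names, VLAN ID, subnet and tenant IDs are placeholders, and the provider attributes will differ depending on your Neutron plugin.

## Create an external network scoped to tenant 1, with sharing left disabled
$ neutron net-create ext-net-tenant1 --router:external --tenant-id <tenant1-uuid> \
      --provider:network_type vlan --provider:physical_network physnet1 --provider:segmentation_id 101
$ neutron subnet-create ext-net-tenant1 203.0.113.0/24 --name ext-subnet-tenant1 \
      --disable-dhcp --allocation-pool start=203.0.113.10,end=203.0.113.200

## Now, using tenant 2's credentials, allocate a floating IP from tenant 1's network.
## This still succeeds, which is exactly the problem described above.
$ neutron floatingip-create ext-net-tenant1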

The next post in the series will discuss a potential design for working around the issue inside OpenStack Neutron.

Multi-tenant OpenStack with NSX – Part 1

openstack-nsx

I have been working on an OpenStack architecture design using VMware Integrated OpenStack (VIO) for the past several months. The design itself is being developed for an internal cloud service offering and is the design for my VCDX certification pursuit in 2017. As the design has gone through the Proof-of-Concept and later the Pilot phases, determining how to offer a multi-tenant/personal cloud service has proven to be challenging. The design relies heavily on the NSX software-defined networking (SDN) platform, both because of requirements from VIO and because of the service offering requirements. As such, all north-south traffic goes through a set of NSX Edge devices. Prior to hitting the north-south boundary, however, the east-west tenant traffic needed to be addressed.

I had seen other designs talked about where a single, large (/22 or greater) external network was relied upon by all OpenStack tenants. However, for a personal cloud or multi-tenant cloud offering based on OpenStack, I felt this was not a good design choice for this environment. I wanted to have a higher level of separation between tenants, and one option involved a secondary pair of NSX Edge devices for each tenant. The following diagram describes the logical approach.

OpenStack NSX

The above logical representation expresses how the deployed tenant NSX Edges are connected upstream to the security or L3 network boundary and downstream to the OpenStack Project that is created for each tenant. At a small to medium scale, I believe this model works operationally: the tenant NSX Edges create a logical separation between each OpenStack tenant and (if assigned on a per-team basis) should remain relatively manageable as the environment scales to dozens of tenants.

The NSX Edge devices are deployed inside the very same OpenStack Edge cluster specified during the VIO deployment; however, they are managed outside of OpenStack and the tenant has no control over them. Each secondary pair of tenant NSX Edge devices is configured with two interfaces: one uplink to the north-south NSX Edge security gateway and a single internal interface that becomes the external network inside OpenStack. It is this internal interface that the OpenStack tenant deploys all of their own micro-segmented networks behind and consumes floating IP addresses from.
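
For illustration only, this is roughly how that internal interface would be surfaced as an OpenStack external network with VIO and the NSX-v Neutron plugin; the portgroup provider type, the dvportgroup moref and the tenant ID are assumptions and placeholders specific to that plugin.

## Create the tenant external network against the port group backing the Edge internal interface
$ neutron net-create ext-net-tenant1 --router:external --tenant-id <tenant1-uuid> \
      --provider:network_type portgroup --provider:physical_network dvportgroup-123

## The subnet's allocation pool becomes the tenant's floating IP range
$ neutron subnet-create ext-net-tenant1 192.0.2.0/24 --name ext-subnet-tenant1 \
      --disable-dhcp --gateway 192.0.2.1 --allocation-pool start=192.0.2.10,end=192.0.2.250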

An upcoming post will describe the deployment and configuration of these tenant NSX Edge devices and their corresponding configuration inside OpenStack.

I am concerned with this approach once the scale of the environment grows to 50+ OpenStack tenants. It is at that point that operational management difficulties will begin to surface, specifically the lack of available automation for creating, linking and configuring both the NSX component deployment and the tenant creation within OpenStack itself (projects, users, role-based access, external networks, subnets, etc.).
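
For context, the OpenStack half of that tenant onboarding, if scripted by hand, would look roughly like the sketch below. The names, role and addressing are placeholders, and the NSX Edge deployment and linking is the part that still has no ready-made automation.

## Create the project and a user with access to it (role name depends on the deployment)
$ openstack project create tenant1 --description "Tenant 1"
$ openstack user create tenant1-admin --project tenant1 --password <password>
$ openstack role add --project tenant1 --user tenant1-admin _member_

## Create the tenant external network and its floating IP subnet
$ neutron net-create ext-net-tenant1 --router:external \
      --tenant-id $(openstack project show tenant1 -f value -c id)
$ neutron subnet-create ext-net-tenant1 198.51.100.0/24 --name ext-subnet-tenant1 \
      --disable-dhcp --allocation-pool start=198.51.100.10,end=198.51.100.250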


Update

After this article was initially written, further testing was performed that influenced how multi-tenant OpenStack external networks can be implemented. As such, Part 2 of the series includes the new information. For completeness, I've decided to include that information here as well for archival purposes.

Part 2

The post yesterday discussed a method for having segmented multi-tenant networks inside of OpenStack. As a series of test cases was worked through with a setup of this nature, a large, gaping hole in OpenStack came into view.

What does the previously described multiple external networks look like inside OpenStack?

admin-external-networks
Admin view with multiple external networks defined in OpenStack.
tenant1-external-networks
Networks from Tenant 1 view.
tenant2-external-networks
Networks from Tenant 2 view.

In the second and third screenshots, you can see that both tenants see both external networks, but each only sees a subnet listed for the external network that was created with its respective tenant-id. At first glance, this would seem to be doing what was intended: each tenant receiving their own external network to consume floating IP addresses from. Unfortunately, it begins to break down when a tenant goes to Compute –> Access & Security –> Floating IPs in Horizon.

floating-ips
Multiple tenant floating IP addresses.

The above screenshot shows a tenant being assigned a floating IP address from what should have been an external network they did not have access to.
facepalm

I felt pretty much like Captain Picard after working through the test cases. Surely, OpenStack would allow a design where tenants have segmented external networks — right?

Unfortunately, OpenStack does not honor this type of segmented external networking design; it will allow any tenant to consume or claim a floating IP address from any of the other external networks. To read how OpenStack fully implements external networks, you can read the documentation here. The issue is highlighted in this passage:

Nevertheless, the concept of ‘external’ implies some forms of sharing, and this has some bearing on the topologies that can be achieved. For instance it is not possible at the moment to have an external network which is reserved to a specific tenant.

Essentially, OpenStack Neutron thinks of external networks differently than I believe most architects do. It also does not honor the tenant-id attribute that is specified when the network is created, nor the shared attribute being left disabled on the external network. The methodology OpenStack Neutron uses is more in line with the AWS consumption model: everyone drinks from the same pool and there is no segmentation between the tenants. I personally do not believe that model works in a private cloud where there are multiple tenants.

The next post in the series will discuss a potential design for working around the issue inside OpenStack Neutron.

VMware Integrated OpenStack – Collapse Compute & Edge Clusters

the_openstack_logo-svg

VMware Integrated OpenStack (VIO) introduced the ability to deploy to multiple vCenter Servers with version 2.5. The feature allowed the OpenStack management VMs to be deployed inside a control plane vCenter, while allowing the data plane to use a separate vCenter server. The architecture model still required three clusters:

  • Management Cluster (Management vCenter Server)
  • Compute Cluster(s) (Workload vCenter Server)
  • Edge Cluster (Workload vCenter Server)

The three-cluster architecture follows the published best practices from both VIO and NSX. Having a dedicated Edge cluster should free up tenant resources and prevent potential issues with noisy network neighbors. However, having a dedicated cluster just for NSX Edge VMs could be overkill in some environments from both a cost and a compute perspective. If you are also using Virtual SAN for hyper-converged infrastructure (HCI), the cost increases considerably, with vSphere, NSX and Virtual SAN licenses being paid for hosts that will be extremely under-utilized.

So how can you collapse the compute and edge clusters in a VMware Integrated OpenStack environment?

In version 3.0 there is a configuration change that makes it possible to collapse these two clusters. Performing the following steps will allow you to deploy a smaller-footprint OpenStack environment using VIO.

Edit the omjs.properties configuration file:

$ sudo vim /opt/vmware/vio/etc/omjs.properties

Add the following lines to the end of the configuration file:

## Collapse the Edge/Compute clusters
oms.allow_shared_edge_cluster = true

Restart the OMS services:

$ sudo restart oms

Once the OMS services have been restarted, the VIO Deployment UI will now allow you to deploy the Edge VMs inside the same Compute cluster on the control plane vCenter Server instance.
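
If you want to confirm the change took effect before heading back into the deployment UI, a quick sanity check along these lines should suffice; it assumes the upstart-style service management implied by the restart command above.

## Verify the property made it into the configuration file
$ grep allow_shared_edge_cluster /opt/vmware/vio/etc/omjs.properties

## Verify the OMS service came back up after the restart
$ sudo status oms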

A couple of caveats to be aware of with this approach:

  • All tenant-deployed Edge VMs will live in the collapsed Edge/Compute cluster. As the environment scales to include multiple compute clusters, only this initial Edge/Compute cluster will have the Edge VMs.
  • The OpenStack Horizon UI is unaware of these tenant-deployed Edge VMs, so the utilization reporting shown for the compute cluster will have discrepancies; how large they are depends on the size of the environment.

Your mileage may vary, but this option allows for some additional flexibility when deploying VMware Integrated OpenStack.