NSX DLR Designated Instance

[Image: NSX Designated Instance]

While a great show, we are going to talk about something slightly different — the NSX Distributed Logical Router (DLR) Designated Instance. NSX has many great features and also many caveats when implementing some of those great features — like having a Designated Instance when using a DLR.

So what is a Designated Instance? Honestly, I did not know what it was until a conversation earlier today with a few co-workers who are a bit more knowledgeable about NSX than I am. Essentially, a Designated Instance is an elected ESXi host that answers all new requests initially, which also makes it a single point of failure.

Let’s look at the logical network diagram I posted yesterday.


Pretty sweet right?

The issue arises when the DLR is connected directly to a VLAN. While technically not a problem (it does exactly what you would expect), it requires one of the ESXi hosts in the transport zone to act as the Designated Instance. As a result, if the Designated Instance ESXi host fails, any new traffic will fail until the election process completes and a new Designated Instance is chosen.

So is it possible to not need a Designated Instance when using a DLR? Yes.

It involves introducing another logical NSX layer into the virtual network design. If you saw my tweet earlier, this is what I meant.

I like NSX, but sometimes I think it adds a little too much complexity for operational simplicity.

Adding a set of ECMP Edges above the DLR and connecting the two together eliminates the requirement for NSX to use a Designated Instance. Here is what an alternative to the previous design would look like.

[Image: external OpenStack network design]

Essentially what I’ve done is create another VXLAN, with a corresponding NSX Logical Switch and connect the uplink from the DLR to it. Then the ECMP Edges use the same Logical Switch as their internal interface. It is on the uplink side of the ECMP Edge where the P2V layer takes place and the VLAN is connected.

Using this design allows the environment to run a dynamic routing protocol both between the DLR and the ECMP Edges and between the ECMP Edges and the upstream physical network, although mileage may vary depending on your physical network. The ECMP Edges introduce additional scalability (limited to 8 Edges), based on the amount of North-South network traffic and the bandwidth required to meet the tenant needs. Features like vSphere Anti-Affinity rules can mitigate the failure of a single ESXi host, which you cannot do when there is a Designated Instance. The design can also take an N+x scenario into consideration when deciding when to scale the ECMP Edges.
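To make the N+x idea a little more concrete, here is a minimal sketch of how the Edge count could be estimated. The function name, the per-Edge throughput figure and the traffic numbers are all hypothetical; only the limit of 8 comes from the discussion above.

```python
import math

MAX_ECMP_EDGES = 8  # the ECMP Edge limit mentioned above


def ecmp_edges_needed(required_gbps: float, per_edge_gbps: float, spare: int = 1) -> int:
    """Return N + x: Edges needed to carry the load, plus `spare` extra Edges.

    `per_edge_gbps` is an assumed, environment-specific figure; measure it
    rather than trusting a number from a blog post.
    """
    if required_gbps <= 0 or per_edge_gbps <= 0:
        raise ValueError("bandwidth values must be positive")
    if spare < 0:
        raise ValueError("spare must not be negative")
    base = math.ceil(required_gbps / per_edge_gbps)  # N
    total = base + spare                             # N + x
    if total > MAX_ECMP_EDGES:
        raise ValueError(f"{total} Edges exceeds the ECMP limit of {MAX_ECMP_EDGES}")
    return total


if __name__ == "__main__":
    # Hypothetical example: 25 Gbps of North-South traffic, 10 Gbps per Edge, N+1
    print(ecmp_edges_needed(25, 10, spare=1))
```

If the function raises because the total exceeds 8, that is a signal the North-South design needs a different approach, not just more Edges.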

So many options open up when NSX is introduced into an architecture, along with a lot of extra complexity. Ultimately the decision should be based on the requirements and the stakeholders' risk acceptance. Relying on a Designated Instance may be acceptable to a stakeholder, while adding more complexity to the design may not be.

Until next time, enjoy!

Understanding a Design Decision


The last couple of months leading into the end of the year have seen me focusing once again on earning the VCDX certification. After a fair amount of examination of my skills, especially my areas of weakness, I knew a new design was needed. Fortunately a new project at work had me focusing on building an entirely new VMware Integrated OpenStack service offering. Being able to work on the design from inception to POC to Pilot has provided me a great learning opportunity. One of my weaknesses has been making sure I understand the ramifications of each design decision being made in the architecture. As I worked through the process of documenting all of the design decisions, I settled on a template within the document.

The following table is replicated for each design decision within the architecture.


One of the ways I worked to improve my understanding of how to document a proper design was the book, IT Architect: Foundation in the Art of Infrastructure Design. In the book I noticed the authors made sure to highlight the design justifications throughout every chapter. I wanted to incorporate those same justifications within my VCDX architecture document and be sure to document the risks, the impacts and the requirements that were achieved by each decision.

In the design I am currently working on, an example of the above table in action can be found in the following image.


Here a decision for the compute platform was made to use the Dell PowerEdge R630 server. Requirements like the SLA had to also be taken into consideration, which you see in the risks and risk mitigation. The table helps to highlight when some design decisions actually add in additional requirements for the architecture — usually found in the Impact or Decision Risks section of the table. In the case of the example, the table notes,

Dell hardware has been prone to failures, includes drives, SD cards and controller failures.

I documented the risk based on knowledge acquired over nearly a decade of using Dell hardware, most recently in my current role. Documenting it as a risk created an ancillary requirement that needed to be addressed, and the subsequent Risk Mitigation fulfills that new requirement.

A 4-hour support contract is purchased for each compute node. In addition, an on-site hardware locker is maintained at the local data center, which contains common components to reduce the mean-time-to-resolution when a failure occurs.

The subsequent decision to purchase a 4-hour support contract from Dell, combined with the on-site hardware locker, allows the design to account for the SLA requirements of the service offering while also addressing a known risk: hardware failure. In my previous VCDX attempt, I did not do a good enough job working through this thought process, which is a key reason why I was not successful.
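For anyone who prefers to keep design decisions in a structured form alongside the document, the template could be modeled roughly as follows. This is only a sketch: the class and field names are my own shorthand for the sections discussed above (justification, impact, risks, risk mitigation, requirements achieved), not the exact headings of the table.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DesignDecision:
    """One row of a design-decision template (field names are illustrative)."""
    decision: str
    justification: str = ""
    impact: str = ""
    risks: List[str] = field(default_factory=list)
    risk_mitigation: List[str] = field(default_factory=list)
    requirements_met: List[str] = field(default_factory=list)

    def unmitigated(self) -> bool:
        """True when risks are recorded but no mitigation has been written yet."""
        return bool(self.risks) and not self.risk_mitigation


# The compute platform example from this post, using the text quoted above.
compute = DesignDecision(
    decision="Use the Dell PowerEdge R630 server for the compute platform",
    risks=["Dell hardware has been prone to failures, includes drives, "
           "SD cards and controller failures."],
    risk_mitigation=["A 4-hour support contract is purchased for each compute node.",
                     "An on-site hardware locker is maintained at the local data center."],
    requirements_met=["SLA"],
)
print(compute.unmitigated())
```

A check like unmitigated() is a cheap way to spot decisions where a risk was written down but never followed through to a mitigation.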

The process of documenting the table has helped me make sure the proper amount of time is spent thinking through every decision. I am also finding that documenting all the decisions is helpful as I review the design with others. All in all, it has been a great process to work through and is helping me know and comprehend every aspect of the design.

As noted previously, I am still pursuing my VCDX certification right now and so these opinions may not be shared by those who have already earned their VCDX certifications.

OpenStack Alert Definitions in vRealize Operations


The previous post discussed the use of the vRealize Operations Management Pack for OpenStack and Endpoint Agent in order to provide detailed service-level monitoring within an environment. The management pack comes with nearly 200 pre-defined alerts for OpenStack that can be leveraged to understand what is occurring within the environment. As I’ve gone through the alerts, these are the key alerts that can be leveraged to understand when any of the OpenStack services are experiencing a partial or complete outage.

OpenStack Compute Alerts

Service | Alert Name | Triggers
Nova | All nova-network services are unavailable | All nova-network services are unavailable
Nova | All nova-xvpvnc-proxy services are unavailable | All nova-xvpvnc-proxy services are unavailable
Nova | All nova-scheduler services are unavailable | All nova-scheduler services are unavailable
Nova | All nova-api services are unavailable | All nova-api services are unavailable
Nova | All nova-consoleauth services are unavailable | All nova-consoleauth services are unavailable
Nova | All nova-cert services are unavailable | All nova-cert services are unavailable
Nova | All nova-compute services are unavailable | All nova-compute services are unavailable
Nova | All nova-conductor services are unavailable | All nova-conductor services are unavailable
Nova | All nova-console services are unavailable | All nova-console services are unavailable
Nova | All nova-novncproxy services are unavailable | All nova-novncproxy services are unavailable
Nova | All nova-objectstore services are unavailable | All nova-objectstore services are unavailable
Nova | The nova-compute service is unavailable | Nova-compute status is unknown
Nova | The nova-objectstore service is unavailable | Nova-objectstore status is unknown
Nova | The nova-conductor service is unavailable | Nova-conductor status is unknown
Nova | The nova-api service is unavailable | Nova-api status is unknown
Nova | The nova-cert service is unavailable | Nova-cert status is unknown
Nova | The nova-console service is unavailable | Nova-console status is unknown
Nova | The nova-consoleauth service is unavailable | Nova-consoleauth status is unknown
Nova | The nova-network service is unavailable | Nova-network status is unknown
Nova | The nova-novncproxy service is unavailable | Nova-novncproxy status is unknown
Nova | The nova-scheduler service is unavailable | Nova-scheduler status is unknown
Nova | The nova-xvpvnc-proxy service is unavailable | Nova-xvpvnc-proxy status is unknown

OpenStack Storage Alerts

Service | Alert Name | Triggers
Glance | All glance-api services are unavailable | All glance-api services are unavailable
Glance | All glance-registry services are unavailable | All glance-registry services are unavailable
Glance | The glance-api service is unavailable | Glance-api status is unknown
Glance | The glance-registry service is unavailable | Glance-registry status is unknown
Cinder | All cinder-api services are unavailable | All cinder-api services are unavailable
Cinder | All cinder-scheduler services are unavailable | All cinder-scheduler services are unavailable
Cinder | All cinder-volume services are unavailable | All cinder-volume services are unavailable
Cinder | The cinder-volume service is unavailable | Cinder-volume status is unknown
Cinder | The cinder-api service is unavailable | Cinder-api status is unknown
Cinder | The cinder-scheduler service is unavailable | Cinder-scheduler status is unknown

OpenStack Network Alerts

Service | Alert Name | Triggers
Neutron | All neutron-dhcp-agent services are unavailable | All neutron-dhcp-agent services are unavailable
Neutron | All neutron-l3-agent services are unavailable | All neutron-l3-agent services are unavailable
Neutron | All neutron-lbaas-agent services are unavailable | All neutron-lbaas-agent services are unavailable
Neutron | All neutron-metadata-agent services are unavailable | All neutron-metadata-agent services are unavailable
Neutron | All neutron-server services are unavailable | All neutron-server services are unavailable
Neutron | The neutron-dhcp-agent service is unavailable | Neutron-dhcp-agent status is unknown
Neutron | The neutron-l3-agent service is unavailable | Neutron-l3-agent status is unknown
Neutron | The neutron-lbaas-agent service is unavailable | Neutron-lbaas-agent status is unknown
Neutron | The neutron-metadata-agent service is unavailable | Neutron-metadata-agent status is unknown
Neutron | The neutron-server service is unavailable | Neutron-server status is unknown

OpenStack Auxiliary Alerts

Service | Alert Name | Triggers
Heat | All heat-api services are unavailable | All heat-api services are unavailable
Heat | All heat-api-cfn services are unavailable | All heat-api-cfn services are unavailable
Heat | All heat-api-cloudwatch services are unavailable | All heat-api-cloudwatch services are unavailable
Heat | All heat-engine services are unavailable | All heat-engine services are unavailable
Heat | The heat-api service is unavailable | Heat-api status is unknown
Heat | The heat-api-cfn service is unavailable | Heat-api-cfn status is unknown
Heat | The heat-api-cloudwatch service is unavailable | Heat-api-cloudwatch status is unknown
Heat | The heat-engine service is unavailable | Heat-engine status is unknown
Keystone | All keystone-all services are unavailable | All keystone-all services are unavailable
Keystone | The keystone-all service is unavailable | Keystone-all status is unknown
MySQL | All MySQL services are unavailable | All MySQL services are unavailable
MySQL | The MySQL Database service is unavailable | MySQL status is unknown
Apache | All Apache services are unavailable | All Apache services are unavailable
Apache | The Apache service is unavailable | Apache status is unknown
Jarvis | All Jarvis services are unavailable | All Jarvis services are unavailable
Memcached | All Memcached services are unavailable | All Memcached services are unavailable
Memcached | The memcached service is unavailable | Memcached status is unknown
RabbitMQ | All RabbitMQ services are unavailable | All RabbitMQ services are unavailable
RabbitMQ | The Rabbit Messaging service is unavailable | Rabbit Message Queue status is unknown
OMS | All tc-oms services are unavailable | All tc-oms services are unavailable
OMS | All tc-osvmw services are unavailable | All tc-osvmw services are unavailable
vPostGres | All vPostGres services are unavailable | All vPostGres services are unavailable
vPostGres | The vpostgres service is unavailable | Vpostgres status is unknown
Ceilometer | The ceilometer-agent-central service is unavailable | Ceilometer-agent-central status is unknown
Ceilometer | The ceilometer-agent-compute service is unavailable | Ceilometer-agent-compute status is unknown
Ceilometer | The ceilometer-agent-notification service is unavailable | Ceilometer-agent-notification status is unknown
Ceilometer | The ceilometer-alarm-evaluator service is unavailable | Ceilometer-alarm-evaluator status is unknown
Ceilometer | The ceilometer-alarm-notifier service is unavailable | Ceilometer-alarm-notifier status is unknown
Ceilometer | The ceilometer-api service is unavailable | Ceilometer-api status is unknown
Ceilometer | The ceilometer-collector service is unavailable | Ceilometer-collector status is unknown

Use of these alerts will help the environment be ready for a production deployment where an SLA can be attached. Enjoy!
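If the alert names are exported from vRealize Operations and need a quick triage, the naming pattern in the tables above can be used to sort them. This is a rough sketch that relies on my own reading of the pattern: "All ... are unavailable" is treated as a complete outage of that service and "The ... is unavailable" as a partial one. The function name is hypothetical and nothing here calls a vRealize Operations API.

```python
def outage_scope(alert_name: str) -> str:
    """Classify an alert name from the tables above by its wording.

    Assumption: "All <svc> services are unavailable" means every instance is
    down (complete), while "The <svc> service is unavailable" means a single
    instance has an unknown status (partial).
    """
    name = alert_name.strip()
    if name.startswith("All ") and name.endswith("are unavailable"):
        return "complete"
    if name.startswith("The ") and name.endswith("is unavailable"):
        return "partial"
    return "unknown"


if __name__ == "__main__":
    for alert in ("All nova-api services are unavailable",
                  "The nova-api service is unavailable"):
        print(f"{outage_scope(alert):8s} {alert}")
```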

Designing for an SLA Metric

Over the weekend I focused on two things: taking care of my six kids while my wife was out of town, and documenting my VCDX design. While working through the Monitoring portion of the design, I found myself focusing on the technical reasons for some of the design decisions I was making to meet the SLA requirements of the design. That prompted the tweet you see to the left. When working on any design, you have to understand where the goal posts are in order to make intelligent decisions. With regard to an SLA, that means understanding what the SLA target is and how frequently the SLA is calculated. As you can see from the image, an SLA calculated against a daily metric allows a very different amount of downtime than an SLA calculated on a weekly or monthly basis.

So what can be done to meet the target SLA? If the monitoring solution is inside the environment, shouldn’t it have a higher target SLA than the thing it is monitoring? As I looked at the downtime numbers, I realized there were places where vSphere HA would not be adequate (by itself) to meet the SLA requirement of the design if it was being calculated on a daily or weekly basis. The ever-elusive 99.99% SLA target eliminates vSphere HA altogether if it is being calculated on anything less than a yearly basis.
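The downtime numbers are easy to reproduce. The sketch below computes the downtime budget for a target SLA over different calculation windows and checks whether a single outage of a given length fits inside it. The 30-day month, the 365-day year and the 300-second restart time are assumptions for illustration only; an actual vSphere HA restart depends entirely on the environment.

```python
PERIOD_SECONDS = {
    "daily": 24 * 3600,
    "weekly": 7 * 24 * 3600,
    "monthly": 30 * 24 * 3600,   # assumes a 30-day month
    "yearly": 365 * 24 * 3600,   # assumes a 365-day year
}


def downtime_budget_seconds(sla_percent: float, period: str) -> float:
    """Seconds of downtime allowed per period for a given SLA percentage."""
    return PERIOD_SECONDS[period] * (100.0 - sla_percent) / 100.0


def fits_budget(outage_seconds: float, sla_percent: float, period: str) -> bool:
    """True if a single outage of this length stays within the period's budget."""
    return outage_seconds <= downtime_budget_seconds(sla_percent, period)


if __name__ == "__main__":
    assumed_restart = 300  # hypothetical HA restart time in seconds
    for sla in (99.9, 99.99):
        for period in PERIOD_SECONDS:
            budget = downtime_budget_seconds(sla, period)
            ok = fits_budget(assumed_restart, sla, period)
            print(f"{sla}% {period:8s} budget={budget:10.2f}s fits={ok}")
```

With the assumed 300-second restart, a 99.99% target only fits when the SLA is calculated yearly, which is the point made above about vSphere HA on its own.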

As the architect of a project it is important to discuss the SLA requirements with the stakeholders and understand where the goal posts are. Otherwise you are designing in the vacuum of space with no GPS to guide you to the target.

SLAs within SLAs

The design I am currently working on had requirements for a central log repository and an SLA target of 99.9% for the tenant workload domain, calculated on a monthly basis. As I worked through the design decisions, however, I came to realize that the central logging capability vRealize Log Insight provides to the environment should be more resilient than the 99.9% uptime of the workload domain it is supporting. This type of SLA within an SLA is the sort of thing you may find yourself having to design against. So how could I increase the uptime to be able to support a higher target SLA for Log Insight?

The post on Friday discussed the clustering capabilities of Log Insight, and it came about as I was working through this problem. If the clustering capability of Log Insight could be leveraged to increase the uptime of the solution, even on physical infrastructure only designed to provide a lower 99.9% SLA, then I could meet the higher target sub-SLA. By including a 3-node Log Insight cluster and creating anti-affinity rules on the vSphere cluster to ensure the Log Insight virtual appliances were never located on the same physical node, I was able to increase the SLA potential of the solution. The last piece of the puzzle was incorporating the internal load balancing mechanism of Log Insight and using the VIP as the target for all of the systems' remote logging functionality. This allowed me to create a central logging repository with a higher target SLA than the underlying infrastructure SLA.
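A back-of-the-napkin way to see why the cluster helps is to treat each node as an independent component with its own availability. The sketch below is a simplification under stated assumptions: node failures are independent (which is what the anti-affinity rules are trying to approximate), every node has the same availability, and the number of nodes that must stay up is a parameter because I am not stating what Log Insight itself requires.

```python
from math import comb


def cluster_availability(node_availability: float, nodes: int, min_up: int = 1) -> float:
    """Probability that at least `min_up` of `nodes` independent nodes are up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if not 1 <= min_up <= nodes:
        raise ValueError("min_up must be between 1 and nodes")
    a = node_availability
    # Binomial sum over every outcome with min_up or more healthy nodes.
    return sum(comb(nodes, k) * a**k * (1 - a) ** (nodes - k)
               for k in range(min_up, nodes + 1))


if __name__ == "__main__":
    # Three nodes, each on infrastructure assumed to deliver 99.9%.
    for min_up in (1, 2):
        value = cluster_availability(0.999, nodes=3, min_up=min_up)
        print(f"min_up={min_up}: {value:.9f}")
```

Even when two of the three nodes must remain healthy, the theoretical figure is well above the 99.9% of any single node. Real numbers will be lower, since shared dependencies such as storage and networking are not independent.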

Designing for and justifying the decisions made to support an SLA is one of the more trying issues in any architecture, at least in my mind. Understanding how each decision positively or negatively influences the SLA goals of the design is something every architect will need to do. This is one area where I was weak during my previous VCDX defense and was not able to articulate accurately. After spending significant time thinking through the key points of my current design, I have definitely learned more and better understand the effects of the choices I am making.

The opinions expressed above are my own and as I have not yet acquired my VCDX certification, these opinions may not be shared by those who have.


Building a Log Insight Cluster


Finding a post for today’s #vDM30in30 post was a challenge. When I set out to complete the challenge I knew the later posts would become more difficult as the weeks wore on, but I didn’t think the challenge would arise so quickly (i.e. the end of week 2). For whatever reason, I could not decide on a topic that I wanted to write about until late this evening. As I was working on the portion of my VCDX design that covers Monitoring and the supporting infrastructure, I found myself thinking about how to incorporate a proper vRealize Log Insight system into the design. That led to tonight’s topic, Log Insight clusters.

I have learned a VCDX design should never include a VMware product just for the sake of including it. The need for vRealize Log Insight in the current design I am working on is justified by the requirements. As I have learned to use Log Insight more extensively over the past year and a half, the strengths of the product continue to amaze me. One such strength is the ease with which it is possible to incorporate a high availability feature into the platform. If you are unfamiliar with vRealize Log Insight, it is an analytics and remote logging platform that acts as a remote syslog server capable of parsing hundreds of thousands of log messages per day. The regular expression capabilities of the product are second-to-none — much better and more reliable than similar products like Splunk (IMHO).

The design I am working on is leveraging VMware Cloud Foundation (VCF) as the hardware and SDDC platform. With this requirement comes certain constraints, including the deployment method VCF uses for vRealize Log Insight. When VCF creates the management domain, it deploys a single vRealize Log Insight virtual appliance. Because I have a requirement to store all relevant log files in a central location, leveraging the existing vRealize Log Insight virtual appliance makes sense. However a single node is a single point of failure, which is not adequate for a production architecture, let alone a VCDX design.

So how can vRealize Log Insight be enhanced to handle a failure? Why a cluster of course! The Engineering team responsible for vRealize Log Insight were kind enough to build a clustering feature into the product and even included an internal load balancer as well! Having a cluster of nodes allows the environment to handle an eventual failure event — whether it is because the VM operating system becomes unresponsive or the underlying ESXi node fails altogether. Once configured, the VIP specified for use by the internal load balancer should be the IP and/or FQDN all of the downstream services use for sending syslog messages.
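As an illustration of pointing a downstream service at the VIP, here is a minimal sketch using Python's standard library syslog handler. It assumes the VIP accepts plain syslog over UDP on port 514; the function name and the FQDN in the comment are hypothetical, so substitute whatever the load balancer is actually configured with.

```python
import logging
import logging.handlers


def build_vip_logger(vip_host: str, port: int = 514, name: str = "app") -> logging.Logger:
    """Return a logger that forwards records to the cluster VIP via syslog (UDP)."""
    # SysLogHandler defaults to UDP when given a (host, port) tuple.
    handler = logging.handlers.SysLogHandler(address=(vip_host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger


# Usage (hypothetical FQDN for the Integrated Load Balancer VIP):
# log = build_vip_logger("loginsight-vip.example.com")
# log.info("service started")
```

The important part is that the application only ever knows about the VIP, never an individual node, so a node failure does not require touching any of the senders.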

Configure a Log Insight Cluster

The creation of a Log Insight cluster is relatively straightforward and I will quickly go through the steps. Remember the Log Insight nodes have a requirement to exist on the same L2 network — no L3 support for multiple geographic clusters currently. Simply deploy three Log Insight virtual appliances and power them on. Once the OS has been started, log into the web UI for the additional instances and perform the following steps.

Select Next to proceed with the configuration on the new node.
Select the Join Existing Deployment option.
Enter the FQDN of the existing master Log Insight node and click Go.
Once joined, select the hyperlink to take you to the master node UI.
Log in using the Administrator credentials for Log Insight.
Select Allow to join the new node to the cluster.
Configure a Virtual IP address for the Integrated Load Balancer.

Add a third node in and you have a working vRealize Log Insight cluster, capable of distributing incoming log messages between multiple nodes. Depending on the SLA for the environment, you can increase the number of nodes within the cluster to meet the requirements.

Fortunately for me, the weekend posts were written on election night and are scheduled to auto-publish. Hopefully that will allow me to spend some much needed time working on VCDX design documentation. The December 1 deadline is fast approaching!