In a recent conversation, it became clear to me that my knowledge of the inner workings of VXLAN and VSAN was not as deep as it could be. Since I am also studying for my VCAP exams, I knew additional time educating myself on these two technologies was a necessity. As a result, I’ve spent the last day diving into IGMP, multicast traffic, and how they are utilized within both VXLAN and VMware VSAN. I wanted to capture what I’ve learned in a blog post as much for myself as for anyone else who might be interested in the subject. Writing down what I’ve learned is one way I can absorb and retain information long-term.

IGMP

IGMP (Internet Group Management Protocol) is a layer 3 network protocol used to establish multicast group memberships. It is encapsulated directly within an IP packet and does not use a transport layer, similar to ICMP. It is also how a host registers with a router to receive multicast traffic. There are two important pieces of IGMP that VXLAN and VSAN take advantage of: IGMP Querier and IGMP Snooping. Without these two pieces, multicast traffic on a layer 2 network would act as nothing more than a broadcast transmission and lack the efficiency required.
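To make the "encapsulated within an IP packet" point concrete, here is a minimal sketch of how an IGMPv2 message could be assembled by hand. It assumes the 8-byte IGMPv2 layout (type, max response time, checksum, group address) and uses 239.1.1.1 purely as an example group; it only builds the bytes and does not send anything.

```python
import socket
import struct

IGMP_PROTOCOL_NUMBER = 2          # IP protocol number for IGMP (no TCP/UDP involved)
IGMP_MEMBERSHIP_QUERY = 0x11      # sent by the querier
IGMP_V2_MEMBERSHIP_REPORT = 0x16  # sent by a host that wants a group


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmpv2_message(msg_type: int, group: str, max_resp_time: int = 0) -> bytes:
    """Pack type, max response time, checksum and group into 8 bytes."""
    group_bytes = socket.inet_aton(group)
    # The checksum is computed with the checksum field set to zero.
    unchecked = struct.pack("!BBH4s", msg_type, max_resp_time, 0, group_bytes)
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH4s", msg_type, max_resp_time, checksum, group_bytes)


# Example group address chosen for illustration only.
report = build_igmpv2_message(IGMP_V2_MEMBERSHIP_REPORT, "239.1.1.1")
print(len(report), report.hex())
```

The resulting bytes would sit directly in the payload of an IP packet whose protocol field is 2, which is why no port numbers appear anywhere.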

IGMP Querier

The IGMP Querier is the router or switch that acts as the master for the IGMP filter lists. It checks and tracks group membership by sending queries at a timed interval; hosts that still want the traffic answer with membership reports.
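The query-and-track behaviour can be sketched as a small simulation. This is a toy model, not how any particular switch implements it: the class name, host names and group address are made up, and the 125 second query interval and 260 second membership timeout are assumed values based on the IGMPv2 defaults.

```python
from dataclasses import dataclass, field


@dataclass
class IgmpQuerier:
    query_interval: float = 125.0      # seconds between general queries (assumed)
    membership_timeout: float = 260.0  # forget a host silent for this long (assumed)
    last_report: dict = field(default_factory=dict)  # (group, host) -> time of last report

    def record_report(self, group: str, host: str, now: float) -> None:
        self.last_report[(group, host)] = now

    def expire_stale(self, now: float) -> None:
        stale = [key for key, seen in self.last_report.items()
                 if now - seen > self.membership_timeout]
        for key in stale:
            del self.last_report[key]

    def members(self, group: str) -> list:
        return sorted(host for (g, host) in self.last_report if g == group)


querier = IgmpQuerier()
group = "239.1.1.1"  # hypothetical group address

# Which hosts answer each periodic query; esxi-02 goes quiet after the first one.
responders_per_query = [
    ["esxi-01", "esxi-02"],
    ["esxi-01"],
    ["esxi-01"],
    ["esxi-01"],
]

for round_number, responders in enumerate(responders_per_query):
    now = round_number * querier.query_interval
    for host in responders:
        querier.record_report(group, host, now)
    querier.expire_stale(now)

print(querier.members(group))
```

After the fourth query, esxi-02 has been silent for longer than the timeout and drops out of the membership list, while esxi-01 stays in.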

IGMP Snooping

On a layer 2 switch, IGMP Snooping allows for the passive monitoring of IGMP packets sent between routers and hosts. The switch uses what it learns to forward multicast traffic only to the ports that have interested receivers. Because snooping is passive, it does not send any additional network traffic across the wire, making it more efficient for multicast traffic passing through the network.
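A rough way to picture snooping is a table on the switch that maps each multicast group to the ports where a membership report was seen. The sketch below is a simplified model with invented class and method names; it ignores router ports and report timeouts, and it treats a group with no known members as having no receivers, which real switches may handle differently.

```python
from collections import defaultdict


class SnoopingSwitch:
    def __init__(self, ports):
        self.ports = set(ports)
        self.group_ports = defaultdict(set)  # group address -> ports with receivers

    def snoop_report(self, port: int, group: str) -> None:
        """Called when a membership report is observed on a port."""
        self.group_ports[group].add(port)

    def snoop_leave(self, port: int, group: str) -> None:
        """Called when a leave message is observed on a port."""
        self.group_ports[group].discard(port)

    def egress_ports(self, ingress_port: int, group: str, snooping: bool = True) -> set:
        """Ports a multicast frame is forwarded to."""
        if not snooping:
            # Without snooping the frame is flooded like a broadcast.
            return self.ports - {ingress_port}
        return set(self.group_ports.get(group, set())) - {ingress_port}


switch = SnoopingSwitch(ports=range(1, 9))
switch.snoop_report(port=2, group="239.1.1.1")
switch.snoop_report(port=5, group="239.1.1.1")

print(sorted(switch.egress_ports(1, "239.1.1.1")))
print(sorted(switch.egress_ports(1, "239.1.1.1", snooping=False)))
```

With snooping, a frame arriving on port 1 goes only to ports 2 and 5; without it, the same frame goes out every other port.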

VXLAN

That’s IGMP in a nutshell — so how is it used in VXLAN?

In order for VXLAN to act as an overlay network, multicast traffic is used to enable the L2 over L3 connectivity, effectively spanning the entire logical network VXLAN has defined. When a virtual machine connects to a VXLAN logical network, it behaves as though everything is within a single broadcast domain. The ESXi hosts configured for VXLAN register themselves as VTEPs (VXLAN Tunnel End Points). Only those VTEPs that register with a given VXLAN logical network receive its multicast traffic. This is accomplished through IGMP Snooping and IGMP Querier. If you have 1000 ESXi hosts configured for VXLAN, but only a subset (say 100) of the hosts participate in a specific VXLAN logical network, you wouldn’t want to send that network’s multicast traffic out to all 1000 ESXi hosts. Doing so would needlessly increase the multicast traffic on the network.
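The 1000-host example can be put into numbers with a small model. This is only a sketch: the VNI-to-group mapping, the group addresses and the host names are hypothetical, and the `Underlay` class simply stands in for the physical network with snooping turned on or off.

```python
from collections import defaultdict

# Hypothetical mapping of VXLAN network identifiers to multicast groups.
VNI_TO_GROUP = {5001: "239.1.1.1", 5002: "239.1.1.2"}


class Underlay:
    def __init__(self, hosts, snooping_enabled: bool):
        self.hosts = set(hosts)
        self.snooping_enabled = snooping_enabled
        self.joined = defaultdict(set)  # group address -> VTEPs that joined

    def join(self, host: str, vni: int) -> None:
        """A VTEP registers for the group backing a logical network."""
        self.joined[VNI_TO_GROUP[vni]].add(host)

    def receivers(self, sender: str, vni: int) -> set:
        """Hosts that receive a multicast frame sent for this logical network."""
        group = VNI_TO_GROUP[vni]
        targets = self.joined[group] if self.snooping_enabled else self.hosts
        return targets - {sender}


hosts = [f"esxi-{n:04d}" for n in range(1000)]


def count_receivers(snooping_enabled: bool) -> int:
    underlay = Underlay(hosts, snooping_enabled)
    for host in hosts[:100]:  # only 100 hosts participate in VNI 5001
        underlay.join(host, 5001)
    return len(underlay.receivers(hosts[0], 5001))


print(count_receivers(True), count_receivers(False))
```

In this model, a frame from one participating VTEP reaches the 99 other members when membership is tracked, and 999 hosts when it is not.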

There is a really good VMware Blog 4-part series on VXLAN and how it operates here.

VMware VSAN

The implementation for VSAN is very similar to that of VXLAN. A VSAN cluster requires a method for learning which ESXi hosts are adjacent to each other and participating in the cluster. VMware uses layer 2 multicast traffic for host discovery within VSAN.

Once again, IGMP Querier and IGMP Snooping play a beneficial role. VMware states that implementing multicast flooding is not a best practice. By leveraging both IGMP Snooping and IGMP Querier, the network is able to understand which hosts want to participate in the VSAN multicast group. This is particularly beneficial when multiple network devices exist on the same VLAN that VSAN is operating on.

If you have multiple VSAN clusters operating on the same VLAN, it is recommended you change the multicast addresses the clusters use so they are not identical. This will prevent one VSAN cluster from receiving another cluster’s multicast traffic. It can also help prevent the Misconfiguration detected error under the Network status section of a VSAN cluster.
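A quick sanity check for overlapping addresses could look like the sketch below. The cluster names and addresses are invented for illustration and are not VSAN defaults; the function only compares whatever addresses you feed it and does not query vCenter or the hosts.

```python
import ipaddress
from itertools import combinations


def find_multicast_conflicts(clusters: dict) -> list:
    """Return (cluster_a, cluster_b, address) for every shared multicast address.

    `clusters` maps a cluster name to the multicast addresses it is configured with.
    """
    for name, addresses in clusters.items():
        for address in addresses:
            if not ipaddress.ip_address(address).is_multicast:
                raise ValueError(f"{name}: {address} is not a multicast address")

    conflicts = []
    for a, b in combinations(sorted(clusters), 2):
        for address in sorted(set(clusters[a]) & set(clusters[b])):
            conflicts.append((a, b, address))
    return conflicts


# Hypothetical clusters sharing one VLAN; the addresses are example values only.
clusters_on_vlan = {
    "vsan-cluster-a": {"239.10.0.1", "239.10.0.2"},
    "vsan-cluster-b": {"239.10.0.1", "239.10.0.2"},  # same as cluster a
    "vsan-cluster-c": {"239.10.1.1", "239.10.1.2"},  # changed to be unique
}

for conflict in find_multicast_conflicts(clusters_on_vlan):
    print(conflict)
```

Here clusters a and b are flagged because they share both addresses, while cluster c is clean because its addresses were changed.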

For a better understanding of how VSAN operates, please check out the VMware blog entry here.

For a seasoned network professional, I highly doubt any of this was new or mind-blowing. For someone who does not generally dive into the various network protocols, but should probably start doing so, this information was both a good refresher on IGMP and a help in understanding VXLAN and VSAN a bit better.

Did I get something wrong? Let me know on Twitter.