vSAN TRIM/UNMAP Functionality

With my new role, I am once again drinking from the firehose. As a result, I have gone heads-down on all things vSAN, including a course on the new features in the vSphere 6.7 U1 release.

The vSAN release included in vSphere 6.7 U1 adds support for guest OS TRIM/UNMAP commands, allowing unused disk space to be reclaimed within a vSAN datastore. Previous versions of vSAN disregarded ATA TRIM and SCSI UNMAP commands issued by a virtual machine guest OS; with 6.7 U1, space that a guest OS marks as unused can now be reclaimed.

There are a few requirements for leveraging the new functionality.

  1. VMs must be thin-provisioned.
  2. Windows OS VMs must be at least HW version 9.
  3. Linux OS VMs must be at least HW version 13.
  4. The vSAN datastore must be On-Disk format version 7.
  5. VMs must be power cycled after TRIM/UNMAP is enabled on the cluster or ESXi host; a guest OS reboot is not sufficient (a quick sketch follows below).
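
The power cycle can of course be done from vCenter, but as a rough sketch it can also be driven from the ESXi shell with vim-cmd. The VM ID below is a placeholder; use the Vmid values reported by getallvms.

List the registered VMs and their IDs:
$ vim-cmd vmsvc/getallvms

Power a VM fully off and back on (a guest reboot does not count as a power cycle):
$ vim-cmd vmsvc/power.off 42
$ vim-cmd vmsvc/power.on 42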

When enabled, the TRIM/UNMAP functionality will begin to reclaim space from objects on each vSAN disk group. If the vSAN cluster is using deduplication, the additional behind-the-scenes work can impact performance, since more operations occur transparently to the consumer. The latency values for TRIM/UNMAP commands can be examined on each ESXi host with the /usr/lib/vmware/vsan/bin/vsanTraceReader utility.

To enable the TRIM/UNMAP functionality on a vSAN cluster or ESXi host, the following commands should be executed.

ESXi Host

To enable the functionality:
$ esxcfg-advcfg -s 1 /VSAN/GuestUnmap

To verify the functionality:
$ esxcfg-advcfg -g /VSAN/GuestUnmap
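
If the change took effect, the get command should report a value of 1 for GuestUnmap (and 0 once it has been disabled).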

To disable the functionality:
$ esxcfg-advcfg -s 0 /VSAN/GuestUnmap

vSAN Cluster

Using RVC, to enable the functionality:
> vsan.unmap_support -e ~CLUSTER_NAME

To verify the functionality:
> vsan.unmap_support ~CLUSTER_NAME

To disable the functionality:
> vsan.unmap_support -d ~CLUSTER_NAME
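
RVC ships with the vCenter Server Appliance, and the ~CLUSTER_NAME shorthand above refers to an RVC mark. A hypothetical session might look like the following; the vCenter hostname, datacenter, and cluster names are placeholders.

$ rvc administrator@vsphere.local@vcsa.example.com
> cd /vcsa.example.com/Datacenter/computers
> mark CLUSTER_NAME Cluster01
> vsan.unmap_support -e ~CLUSTER_NAME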

Once the functionality is enabled on all hosts within the vSAN cluster, and the VMs have all been power cycled, there is one more thing to consider. There are two methods by which TRIM/UNMAP actually reclaims the unused space reported by the guest OS: Passive and Active.

Passive

  • For Microsoft Windows Server 2012 and later operating systems, passive reclamation is enabled by default and reclaim operations are performed automatically.
  • For Linux operating systems, passive reclamation only occurs if the filesystem has been mounted with the discard option (see the example below).
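
As an example, a hypothetical /etc/fstab entry mounting an ext4 filesystem with online discard enabled might look like this (device and mount point are placeholders):

/dev/sdb1   /data   ext4   defaults,discard   0   2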

Active

  • For a Microsoft Windows Server operating system, the Optimize Drives utility must be leveraged.
  • For a Linux operating system, the fstrim command must be leveraged (examples of both follow below).
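
For reference, the on-demand equivalents look something like the following; the drive letter and mount point are placeholders.

Windows (PowerShell, also available through the Optimize Drives GUI):
PS> Optimize-Volume -DriveLetter C -ReTrim -Verbose

Linux:
$ fstrim -v /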

Enjoy!

Claim vSAN Capacity Disks for VCF 3.0

The latest release of VMware Cloud Foundation (VCF 3.0) removed the host imaging functionality. As part of the laundry list of prerequisites for preparing an environment for VCF, one necessary step in an All-Flash vSAN environment is to mark the appropriate capacity disks.

During a VCF 3.0 POC deployment last week, this prerequisite became evident and required a quick way to mark the disks without gleaning all of the information manually. The following method is a quick way to identify which disks should be used for capacity and to correctly allocate them for vSAN to claim during the VCF deployment workflows for either the Management or Workload Domain.

On the first ESXi node, we need to execute the following command to determine the capacity disk size. This command can be omitted on all remaining ESXi nodes as you prep them for VCF.

$ esxcli storage core device list
naa.58ce38ee20455a75
   Display Name: Local TOSHIBA Disk (naa.58ce38ee20455a75)
   Has Settable Display Name: true
   Size: 3662830
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/naa.58ce38ee20455a75
   Vendor: TOSHIBA
   Model: PX05SRB384Y
   Revision: AS0C
   SCSI Level: 6
   Is Pseudo: false
   Status: on
   Is RDM Capable: true
   Is Local: true
   Is Removable: false
   Is SSD: true
   Is VVOL PE: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters:
   VAAI Status: unknown
   Other UIDs: vml.020000000058ce38ee20455a75505830355352
   Is Shared Clusterwide: false
   Is Local SAS Device: true
   Is SAS: true
   Is USB: false
   Is Boot USB Device: false
   Is Boot Device: false
   Device Max Queue Depth: 254
   No of outstanding IOs with competing worlds: 32
   Drive Type: physical
   RAID Level: NA
   Number of Physical Drives: 1
   Protection Enabled: false
   PI Activated: false
   PI Type: 0
   PI Protection Mask: NO PROTECTION
   Supported Guard Types: NO GUARD SUPPORT
   DIX Enabled: false
   DIX Guard Type: NO GUARD SUPPORT
   Emulated DIX/DIF Enabled: false

The above output is an example of a vSAN SSD capacity disk. The only bit of information we need to automate the rest of the work is the size of the disk. Once you know the size, substitute that value into the first grep command and execute the following CLI script on each node.

$ esxcli storage core device list | grep -B 3 -e "Size: 3662830" | grep ^naa > /tmp/capacitydisks; for i in `cat /tmp/capacitydisks`; do esxcli vsan storage tag add -d $i -t capacityFlash;  vdq -q -d $i; done

As each disk is marked as eligible for vSAN, the script will output that information for the user.
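
If you prefer something easier to read than the one-liner, the same steps can be broken out as below; the 3662830 value is simply the size from the example output above, so substitute your own.

# Collect the device IDs of every disk that matches the capacity-tier size
$ esxcli storage core device list | grep -B 3 -e "Size: 3662830" | grep ^naa > /tmp/capacitydisks

# Tag each device as vSAN capacity flash, then confirm its eligibility with vdq
$ while read -r disk; do
    esxcli vsan storage tag add -d "$disk" -t capacityFlash
    vdq -q -d "$disk"
  done < /tmp/capacitydisks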

That’s it!

If you’d like to read more about the VCF 3.0 release, please check out the DataReload blog post.

Virtual SAN IOPS Limiting

IOPS Limit for Virtual SAN Objects

Virtual SAN version 6.2 introduced the ability to limit the number of IOPS a virtual machine object can consume. Virtual SAN normalizes each IO, read or write, to 32 KB blocks when performing the calculation, so a single 64 KB IO counts as two IOs against the limit. The limit is specified as part of the storage policy and is applied to each virtual machine object.

The following table demonstrates how actual IOPS are calculated on a Virtual SAN virtual machine object.

VM IO Size | VSAN IO Size | VSAN IOPS Limit | Actual IOPS Limit
32 KB      | 32 KB        | 10,000          | 10,000
64 KB      | 32 KB        | 10,000          | 5,000
128 KB     | 32 KB        | 10,000          | 2,500

Once a virtual machine object reaches the IOPS limit specified in the storage policy, any remaining IO is delayed until the next one-second window. The current implementation creates a spiky IO profile for the virtual machine objects, which is far from ideal.

SIOC or Virtual SAN Limits?

A key differentiator between the Virtual SAN IOPS limit and Storage I/O Control (SIOC) is that the Virtual SAN limit is applied regardless of any congestion that may or may not exist on the datastore. As a result, a virtual machine object can be throttled even though the Virtual SAN datastore has plenty of unused or free IOPS.

As mentioned in the previous post, SIOC provides the ability to assign resource shares and/or a limit to a virtual machine. The current Virtual SAN feature is only a limit, meaning it is enforced regardless of the overall IO resource consumption on the Virtual SAN datastore. As such, it is not a feature I have seen regularly implemented in hyper-converged infrastructure (HCI) architectures. It would be nice to see the full SIOC functionality added to Virtual SAN in the future.

VMware Virtual SAN Hyper-converged Calculator

I have been heavily involved in designing our next-generation, large-scale hyper-converged (HCI) private cloud architecture at work over the past couple of months. As part of that design, we needed a way to easily calculate available resources and cluster sizes using VMware Virtual SAN. When determining the available resources and the effects of the new Virtual SAN 6.2 features, the calculations became rather complex pretty quickly. A spreadsheet was born.

The spreadsheet allows a user to input the characteristics of their HCI nodes, and from there it calculates the resources available per node and per cluster size (4 nodes to 64 nodes). The key assistance the spreadsheet provides is the ability to specify a VM unit that can be used to determine how many units per server are necessary to fulfill an architecture's requirements. The VM unit should be based on the workload (known or expected) that will operate within the architecture.

The spreadsheet also allows the user to input the VSAN FTT policies, the VSAN deduplication efficiency factor, and memory overcommitment factors, all in an effort to help determine what cluster sizes should be used and how different server configurations affect the calculations.
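
To give a rough sense of the math involved, here is a hypothetical back-of-the-napkin version of the capacity portion. The raw capacity and deduplication factor are made-up inputs, not values from the spreadsheet, and the protection overheads are the usual approximations of 2.0x for RAID-1 FTT=1, 1.33x for RAID-5, and 1.5x for RAID-6.

$ awk 'BEGIN { raw=100; dedup=1.5;
       printf "RAID-1: %.1f TB usable\nRAID-5: %.1f TB usable\nRAID-6: %.1f TB usable\n",
              raw*dedup/2.0, raw*dedup/1.33, raw*dedup/1.5 }'
RAID-1: 75.0 TB usable
RAID-5: 112.8 TB usable
RAID-6: 100.0 TB usable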

A few key cells that should be modified by the user initially:

  • B2-B5 – HCI node CPU characteristics
  • B10 – HCI node Memory characteristic
  • B15-16,B18-19 – HCI node VSAN disk configuration
  • B22-28 – Expected/desired VSAN and cluster efficiencies. A value of 1.0 for any efficiency factor is the baseline.

From there, the remaining cells will be updated to provide an HCI node summary box (highlighted in yellow) and cluster node sizes. The user can then see what the different configurations will yield with VSAN RAID-1, RAID-5, and RAID-6 configurations based on the values entered in the spreadsheet.

The spreadsheet takes into consideration the number of VSAN disk groups, the ESXi system overhead for memory and CPU, and the overhead VSAN 6.2 introduces as well.

All-in-all, this has proven to be a good tool for our team as we’ve been working on our new HCI design and hopefully will be a useful tool for others as well.

The spreadsheet can be downloaded here.