With my new role, I am once again drinking from the firehose. As a result, I have gone heads-down on all things vSAN, including a course on the new features in the vSphere 6.7 U1 release.
The vSAN release included in vSphere 6.7 U1 adds support for TRIM/UNMAP commands issued by a guest OS, allowing unused disk space to be reclaimed on a vSAN datastore. Previous versions of vSAN disregarded ATA TRIM and SCSI UNMAP commands issued by a virtual machine guest OS. With vSphere 6.7 U1, vSAN can now reclaim blocks the guest OS has marked as unused.
There are a few requirements for leveraging the new functionality.
- VMs must be thin-provisioned.
- Windows OS VMs must be at least HW version 9.
- Linux OS VMs must be at least HW version 13.
- The vSAN datastore must be On-Disk format version 7.
- VMs must be power cycled after TRIM/UNMAP is enabled on the cluster or ESXi host. A reboot is not sufficient.
When enabled, the TRIM/UNMAP functionality begins reclaiming space on each vSAN disk group. If the vSAN cluster is leveraging deduplication, the behind-the-scenes work can impact performance, since additional operations occur transparently to the consumer. The latency of TRIM/UNMAP commands can be examined on each ESXi host with the
/usr/lib/vmware/vsan/bin/vsanTraceReader utility.
To enable the TRIM/UNMAP functionality on a vSAN cluster or ESXi host:
$ esxcfg-advcfg -s 1 /VSAN/GuestUnmap
To verify the functionality:
$ esxcfg-advcfg -g /VSAN/GuestUnmap
To disable the functionality:
$ esxcfg-advcfg -s 0 /VSAN/GuestUnmap
Using RVC (the Ruby vSphere Console), to enable the functionality:
> vsan.unmap_support -e ~CLUSTER_NAME
To verify the functionality:
> vsan.unmap_support ~CLUSTER_NAME
To disable the functionality:
> vsan.unmap_support -d ~CLUSTER_NAME
Once the functionality is enabled on all hosts within the vSAN cluster, and the VMs have all undergone a power cycle, there is one more thing to consider. There are two methods for the TRIM/UNMAP functionality to actually reclaim the unused space reported by the guest OS: Passive and Active.
Passive reclamation:
- For Microsoft Windows Server 2012 and later operating systems, it is enabled by default and reclaim operations are performed automatically.
- For Linux operating systems, it is not enabled by default unless the filesystem has been mounted with the discard option.
Active reclamation:
- For a Microsoft Windows Server operating system, the Optimize Drives utility must be leveraged.
- For a Linux operating system, the fstrim command must be leveraged.
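As a sketch of the Linux side: the `has_discard` helper below is a hypothetical convenience (not part of any VMware or Linux tooling) that checks whether a mount options string enables passive, online discard; the commented commands show where `fstrim` fits for an active pass. Mount points shown are illustrative.

```shell
# Hypothetical helper: does a mount options string enable online discard?
has_discard() {
  case ",$1," in
    *,discard,*) return 0 ;;
    *) return 1 ;;
  esac
}

# In practice you would feed it the live options for a mount point, e.g.:
#   has_discard "$(findmnt -no OPTIONS /)" || sudo fstrim -v /
# fstrim -v performs an active reclaim pass and prints the bytes trimmed;
# fstrim -av trims every mounted filesystem that supports it.

has_discard "rw,relatime,discard" && echo "passive reclamation enabled"
```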
The latest release of VMware Cloud Foundation (VCF 3.0) removed the host imaging functionality. As part of the laundry list of prerequisites for preparing an environment for VCF, one necessary step in an All-Flash vSAN environment is to mark the appropriate capacity disks.
During a POC deployment of VCF 3.0 last week, this prerequisite became evident and required a quick solution for marking the disks without gleaning all of the information manually. The following method quickly identifies which disks should be used for capacity and correctly allocates them for vSAN to claim during the VCF deployment workflows for either the Management or Workload Domain.
On the first ESXi node, execute the following command to determine the capacity disk size. This step can be omitted on the remaining ESXi nodes as you prep them for VCF.
$ esxcli storage core device list
naa.58ce38ee20455a75
Display Name: Local TOSHIBA Disk (naa.58ce38ee20455a75)
Has Settable Display Name: true
Size: 3662830
Device Type: Direct-Access
Multipath Plugin: NMP
Devfs Path: /vmfs/devices/disks/naa.58ce38ee20455a75
SCSI Level: 6
Is Pseudo: false
Is RDM Capable: true
Is Local: true
Is Removable: false
Is SSD: true
Is VVOL PE: false
Is Offline: false
Is Perennially Reserved: false
Queue Full Sample Size: 0
Queue Full Threshold: 0
Thin Provisioning Status: yes
VAAI Status: unknown
Other UIDs: vml.020000000058ce38ee20455a75505830355352
Is Shared Clusterwide: false
Is Local SAS Device: true
Is SAS: true
Is USB: false
Is Boot USB Device: false
Is Boot Device: false
Device Max Queue Depth: 254
No of outstanding IOs with competing worlds: 32
Drive Type: physical
RAID Level: NA
Number of Physical Drives: 1
Protection Enabled: false
PI Activated: false
PI Type: 0
PI Protection Mask: NO PROTECTION
Supported Guard Types: NO GUARD SUPPORT
DIX Enabled: false
DIX Guard Type: NO GUARD SUPPORT
Emulated DIX/DIF Enabled: false
The above output is an example of a vSAN SSD capacity disk. The only bit of information we need to automate the rest of the work is the size of the disk. Once you know the size, substitute the value into the first grep expression and execute the following CLI script on each node.
$ esxcli storage core device list | grep -B 3 -e "Size: 3662830" | grep ^naa > /tmp/capacitydisks; for i in $(cat /tmp/capacitydisks); do esxcli vsan storage tag add -d "$i" -t capacityFlash; vdq -q -d "$i"; done
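As a rough alternative, the size match and device-name extraction can be collapsed into a single awk pass; the size value is the same illustrative one as above.

```shell
# Walk the device list once: remember each naa device header line, then
# emit it when the matching capacity size appears a few lines later.
esxcli storage core device list \
  | awk '/^naa\./ {dev=$1} /Size: 3662830/ {print dev}' \
  | while read -r disk; do
      esxcli vsan storage tag add -d "$disk" -t capacityFlash
      vdq -q -d "$disk"
    done
```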
As each disk is marked as eligible for vSAN, the script will output that information for the user.
If you’d like to read more about the VCF 3.0 release, please check out the DataReload blog post.
IOPS Limit for Virtual SAN Objects
Virtual SAN version 6.2 introduced the ability to limit the number of IOPS a virtual machine object can consume per second. Virtual SAN normalizes each IO, read or write, to 32 KB blocks when it performs the calculation. The throttle is specified as part of the storage policy and is applied to each virtual machine object.
The following table demonstrates how actual IOPS are calculated for a Virtual SAN virtual machine object (the rows assume an illustrative policy limit of 1,000 IOPS).

| VM IO Size | VSAN IO Size | VSAN IOPS Limit | Actual IOPS Limit |
|------------|--------------|-----------------|-------------------|
| 32 KB      | 1 x 32 KB    | 1,000           | 1,000             |
| 64 KB      | 2 x 32 KB    | 1,000           | 500               |
| 128 KB     | 4 x 32 KB    | 1,000           | 250               |
Once a virtual machine encounters the IOPS limit specified in the storage policy, any remaining IO is delayed until the next one-second window. The current implementation creates a spiky IO profile for the virtual machine objects, which is far from ideal.
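The normalization works out to simple arithmetic: each guest IO counts as ceil(size / 32 KB) normalized IOs against the policy limit. A minimal sketch, with an illustrative 1,000 IOPS limit:

```shell
# Effective guest IOPS under a vSAN IOPS limit policy. Every guest IO is
# charged as ceil(io_size / 32 KB) normalized 32 KB IOs against the limit.
actual_iops_limit() {
  local io_kb=$1 limit=$2 norm_kb=32
  local ios=$(( (io_kb + norm_kb - 1) / norm_kb ))  # ceiling division
  echo $(( limit / ios ))
}

actual_iops_limit 32 1000    # -> 1000
actual_iops_limit 64 1000    # -> 500
actual_iops_limit 128 1000   # -> 250
```

Note that a 40 KB IO is charged the same as a 64 KB IO, since both round up to two normalized 32 KB IOs.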
SIOC or Virtual SAN Limits?
A key differentiator between the Virtual SAN IOPS limit and Storage I/O Control (SIOC) is that the IOPS limit specified in the Virtual SAN storage policy is applied regardless of any congestion that may or may not exist on the datastore. As a result, a virtual machine object can be throttled even when the Virtual SAN datastore has plenty of unused or free IOPS.
As mentioned in the previous post, SIOC provides the ability to assign a virtual machine resource shares and/or a limit. The current Virtual SAN feature is only a limit, meaning it is enforced regardless of the overall IO resource consumption on the Virtual SAN datastore. As such, it is not a feature I have seen regularly implemented in hyper-converged infrastructure (HCI) architectures. It would be nice to see the full SIOC functionality added to Virtual SAN in the future.
I have been heavily involved in designing our next-generation, large-scale hyper-converged (HCI) private cloud architecture at work for the past couple of months. As part of that design, we needed a way to easily calculate resources available and cluster sizes using VMware Virtual SAN. When determining the resources available and the effects of the new Virtual SAN 6.2 features, the calculations became complex rather quickly. A spreadsheet was born.
The spreadsheet allows a user to input the characteristics of their HCI nodes, and from there the spreadsheet will calculate resources available per node and per cluster size (4 nodes – 64 nodes). The key assistance the spreadsheet provides is the ability to specify a VM unit that can be used to determine how many units per server are necessary to fulfill an architecture's requirements. The VM unit should be based on the workload (known or expected) that will operate within the architecture.
The spreadsheet also allows the user to input the VSAN FTT policies, the VSAN deduplication efficiency factor, and memory overcommitment factors, all in an effort to help the user determine which cluster sizes to use and how different server configurations affect the calculations.
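As a rough illustration of how the FTT/RAID factors drive the math: usable capacity is raw capacity divided by the protection multiplier. The 10,240 GB raw figure is an assumption for the example, and the multipliers (2.00x for RAID-1 FTT=1, 1.33x for RAID-5, 1.50x for RAID-6) are the standard VSAN erasure-coding overheads.

```shell
# Usable capacity = raw capacity / protection multiplier. Shell integer
# math only, so the multiplier is passed scaled by 100 (200 = 2.00x).
usable_gb() {
  local raw_gb=$1 mult_x100=$2
  echo $(( raw_gb * 100 / mult_x100 ))
}

usable_gb 10240 200   # RAID-1, FTT=1 -> 5120
usable_gb 10240 133   # RAID-5        -> 7699
usable_gb 10240 150   # RAID-6        -> 6826
```

Deduplication and overcommitment factors would then multiply these figures further, which is exactly the compounding the spreadsheet handles for you.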
A few key cells that should be modified by the user initially:
- B2-B5 – HCI node CPU characteristics
- B10 – HCI node Memory characteristic
- B15-16,B18-19 – HCI node VSAN disk configuration
- B22-28 – Expected/desired VSAN and cluster efficiencies. A value of 1.0 for any efficiency factor is the baseline.
From there, the remaining cells are updated to provide an HCI node summary box (highlighted in yellow) and cluster node sizes. The user can then see what the different configurations will yield with VSAN RAID-1, RAID-5, and RAID-6 configurations based on the values entered in the spreadsheet.
The spreadsheet takes into consideration the number of VSAN disk groups, the ESXi system overhead for memory and CPU, and the overhead VSAN 6.2 introduces as well.
All-in-all, this has proven to be a good tool for our team as we’ve been working on our new HCI design and hopefully will be a useful tool for others as well.
The spreadsheet can be downloaded here.