vSAN achieves high availability and performance by distributing data across multiple hosts in the vSAN cluster. Data is transmitted over the vSAN network, and in some cases a large amount of data must be copied over that network. During a resync operation, VMs on the vSAN cluster might become inaccessible if there is not enough raw capacity available on the vSAN datastore.
By default, 20% of capacity should remain free for vSAN to function optimally. When a disk has less than 20% free space, vSAN automatically attempts to balance capacity utilization by moving data from that disk to other disks in the vSAN cluster.
vSAN waits for 60 minutes by default before starting any repair and rebuild operation. This delay exists because many issues are transient and resolve before a rebuild is needed.
Changing the default repair delay time for vSAN:
You can change the default delay to a longer time frame by running the command below and then restarting the Cluster Level Object Manager (CLOM) service, clomd. These commands need to be run on all ESXi hosts in the vSAN cluster:
esxcli system settings advanced set -o /VSAN/ClomRepairDelay -i <value in minutes>
/etc/init.d/clomd restart
Note: The default of 60 minutes is designed to cover a multitude of different configurations. Setting this option too aggressively can cause unnecessary resync operations. When changing this advanced option, consider these factors:
- Installation of ESXi updates (if performing updates)
- ESXi host boot time (Including Power On Self-Test)
- SSD Log recovery for vSAN
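As a rough illustration of how the factors above could be combined, the sketch below adds up estimated durations and prints the resulting commands instead of running them. The function name suggest_repair_delay, the sample durations, the 15-minute safety margin, and the choice never to go below the 60-minute default are all assumptions made for this example, not VMware guidance.

```shell
#!/bin/sh
# Hypothetical helper (not part of ESXi): estimate a ClomRepairDelay value
# in minutes from the factors listed above.
# Arguments: update time, host boot time (including POST),
#            SSD log recovery time, optional safety margin (default 15).
suggest_repair_delay() {
    update_min=$1
    boot_min=$2
    recovery_min=$3
    margin_min=${4:-15}   # assumed safety margin, adjust as needed

    total=$((update_min + boot_min + recovery_min + margin_min))

    # Assumption for this sketch: never suggest less than the 60-minute default.
    if [ "$total" -lt 60 ]; then
        total=60
    fi
    echo "$total"
}

# Sample estimates in minutes (illustrative values only).
delay=$(suggest_repair_delay 30 20 25)

# Print the commands for review rather than executing them.
echo "esxcli system settings advanced set -o /VSAN/ClomRepairDelay -i $delay"
echo "/etc/init.d/clomd restart"
```

The printed commands would still need to be run on every ESXi host in the vSAN cluster, as described above.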
Changing the disk threshold for rebalancing of vSAN objects:
You can change the default rebalance threshold of 20% free capacity on the data disks of the vSAN cluster by using the command below:
esxcfg-advcfg -s 85 /VSAN/ClomRebalanceThreshold && /etc/init.d/clomd restart
The above command sets the rebalance threshold to 85% utilization, which corresponds to 15% free capacity on the vSAN disks. vSAN will therefore start rebalancing objects once disk utilization reaches 85% or above.
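Because the threshold is expressed as utilization while the requirement is usually stated as free capacity, the conversion is easy to get backwards. The sketch below converts a desired free-capacity percentage into the threshold value and prints the command for review. The function name free_to_threshold is hypothetical, and the accepted range of 1 to 99 is an assumption for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of ESXi): convert a desired free-capacity
# percentage into the utilization value used by /VSAN/ClomRebalanceThreshold.
free_to_threshold() {
    free_pct=$1

    # Reject empty or non-numeric input.
    case "$free_pct" in
        ''|*[!0-9]*)
            echo "error: free percentage must be a whole number" >&2
            return 1
            ;;
    esac

    # Assumed sanity range for this sketch.
    if [ "$free_pct" -lt 1 ] || [ "$free_pct" -gt 99 ]; then
        echo "error: free percentage must be between 1 and 99" >&2
        return 1
    fi

    echo $((100 - free_pct))
}

# 15% free capacity corresponds to a threshold of 85% utilization.
threshold=$(free_to_threshold 15)

# Print the command for review rather than executing it.
echo "esxcfg-advcfg -s $threshold /VSAN/ClomRebalanceThreshold && /etc/init.d/clomd restart"
```

On an ESXi host, the value currently in effect can be checked with esxcfg-advcfg -g /VSAN/ClomRebalanceThreshold before and after the change.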