Virtual Machine Scale Set automatic repair policy is not enabled#
Reliability · Virtual Machine Scale Sets · Rule · 2025_06 · Important
Applications or infrastructure relying on a virtual machine scale set may fail if VM instances are unhealthy.
Description#
Virtual Machine Scale Sets (VMSS) provide management and scaling automation for a set of virtual machine (VM) instances. One feature of VMSS is the ability to detect and automatically repair unhealthy VM instances. When enabled, automatic instance repairs helps achieve reliability by maintaining a set of healthy instances. Automatic instance repairs does this by:
- Monitoring the health of VM instances.
- Performing a pre-configured repair action when an unhealthy instance is detected.
To enable automatic instance repairs, you must:
- Configure application health monitoring with either the Application Health extension or Load balancer health probes.
- Configure the automaticRepairsPolicy property in the VMSS resource definition. This policy enables automatic repairs and sets the grace period and the repair action.
See documentation references below for additional details.
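As a rough illustration of the first requirement, the Application Health extension could be added to an existing Linux scale set as a child resource. This is a minimal sketch, not a definitive configuration: the extension name, the port, and the request path are illustrative assumptions, and the application is assumed to expose an HTTP health endpoint at that path.

```bicep
// Name of an existing scale set (assumed to be deployed already).
param name string

resource vmss 'Microsoft.Compute/virtualMachineScaleSets@2024-11-01' existing = {
  name: name
}

// Application Health extension for Linux instances.
// 'HealthExtension', port 80, and '/health' are illustrative values only.
resource healthExtension 'Microsoft.Compute/virtualMachineScaleSets/extensions@2024-11-01' = {
  parent: vmss
  name: 'HealthExtension'
  properties: {
    publisher: 'Microsoft.ManagedServices'
    type: 'ApplicationHealthLinux'
    typeHandlerVersion: '1.0'
    autoUpgradeMinorVersion: true
    settings: {
      protocol: 'http'
      port: 80
      requestPath: '/health'
    }
  }
}
```

Existing instances may need to be upgraded to the latest scale set model before the extension starts reporting health. A load balancer health probe is the alternative when the scale set already sits behind a load balancer.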
Recommendation#
Consider enabling automatic instance repair of unhealthy VM instances in a virtual machine scale set.
Examples#
Configure with Bicep#
To deploy virtual machine scale sets that pass this rule:
- Set the properties.automaticRepairsPolicy.enabled property to true.
- Optionally:
  - Set the properties.automaticRepairsPolicy.gracePeriod property to specify a grace period before repairs are initiated. By default, the grace period is 10 minutes. The grace period should be set to an interval that allows the instance to stabilize after initial deployment.
  - Set the properties.automaticRepairsPolicy.repairAction property to specify the repair action. The default repair action is Replace, which deletes and recreates the unhealthy instance with a new one. Other repair actions include Reimage and Restart.
For example:
resource vmss 'Microsoft.Compute/virtualMachineScaleSets@2024-11-01' = {
  name: name
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  sku: {
    name: 'Standard_D8ds_v6'
    tier: 'Standard'
    capacity: 3
  }
  properties: {
    overprovision: true
    upgradePolicy: {
      mode: 'Automatic'
    }
    automaticRepairsPolicy: {
      enabled: true
      gracePeriod: 'PT10M'
      repairAction: 'Replace'
    }
    singlePlacementGroup: true
    virtualMachineProfile: {
      storageProfile: {
        osDisk: {
          caching: 'ReadWrite'
          createOption: 'FromImage'
        }
        imageReference: {
          publisher: 'MicrosoftCblMariner'
          offer: 'azure-linux-3'
          sku: 'azure-linux-3-gen2-fips'
          version: 'latest'
        }
      }
      osProfile: {
        adminUsername: adminUsername
        computerNamePrefix: 'vmss-01'
        linuxConfiguration: {
          disablePasswordAuthentication: true
          provisionVMAgent: true
          ssh: {
            publicKeys: [
              {
                path: '/home/azureuser/.ssh/authorized_keys'
              }
            ]
          }
        }
      }
      networkProfile: {
        networkInterfaceConfigurations: [
          {
            name: 'vmss-001'
            properties: {
              primary: true
              enableAcceleratedNetworking: true
              ipConfigurations: [
                {
                  name: 'ipconfig1'
                  properties: {
                    primary: true
                    subnet: {
                      id: subnetId
                    }
                    privateIPAddressVersion: 'IPv4'
                    loadBalancerBackendAddressPools: [
                      {
                        id: backendPoolId
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
    }
  }
  zones: [
    '1'
    '2'
    '3'
  ]
}
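The optional grace period and repair action could also be exposed as parameters so that each environment can tune them. The following is a minimal sketch under stated assumptions: the parameter names repairGracePeriod and repairAction and the output name are hypothetical, and only the automaticRepairsPolicy property names and the three repair actions come from the guidance above.

```bicep
// Hypothetical parameter: grace period as an ISO 8601 duration (PT10M = 10 minutes).
@description('Grace period before repairs are initiated, as an ISO 8601 duration.')
param repairGracePeriod string = 'PT10M'

// Hypothetical parameter: restricted to the repair actions named in this rule.
@description('Repair action performed on an unhealthy instance.')
@allowed([
  'Replace'
  'Reimage'
  'Restart'
])
param repairAction string = 'Replace'

// Policy object shaped for properties.automaticRepairsPolicy on the scale set.
output automaticRepairsPolicy object = {
  enabled: true
  gracePeriod: repairGracePeriod
  repairAction: repairAction
}
```

In a real template the same object would more likely be assigned directly to properties.automaticRepairsPolicy on the scale set resource; it is emitted as an output here only to keep the sketch self-contained.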
Configure with Azure Verified Modules
A pre-validated module supported by Microsoft is available from the Azure Bicep public registry. To reference the module, please use the following syntax:
To use the latest version:
Configure with Azure template#
To deploy virtual machine scale sets that pass this rule:
- Set the properties.automaticRepairsPolicy.enabled property to true.
- Optionally:
  - Set the properties.automaticRepairsPolicy.gracePeriod property to specify a grace period before repairs are initiated. By default, the grace period is 10 minutes. The grace period should be set to an interval that allows the instance to stabilize after initial deployment.
  - Set the properties.automaticRepairsPolicy.repairAction property to specify the repair action. The default repair action is Replace, which deletes and recreates the unhealthy instance with a new one. Other repair actions include Reimage and Restart.
For example:
{
  "type": "Microsoft.Compute/virtualMachineScaleSets",
  "apiVersion": "2024-11-01",
  "name": "[parameters('name')]",
  "location": "[parameters('location')]",
  "identity": {
    "type": "SystemAssigned"
  },
  "sku": {
    "name": "Standard_D8ds_v6",
    "tier": "Standard",
    "capacity": 3
  },
  "properties": {
    "overprovision": true,
    "upgradePolicy": {
      "mode": "Automatic"
    },
    "automaticRepairsPolicy": {
      "enabled": true,
      "gracePeriod": "PT10M",
      "repairAction": "Replace"
    },
    "singlePlacementGroup": true,
    "virtualMachineProfile": {
      "storageProfile": {
        "osDisk": {
          "caching": "ReadWrite",
          "createOption": "FromImage"
        },
        "imageReference": {
          "publisher": "MicrosoftCblMariner",
          "offer": "azure-linux-3",
          "sku": "azure-linux-3-gen2-fips",
          "version": "latest"
        }
      },
      "osProfile": {
        "adminUsername": "[parameters('adminUsername')]",
        "computerNamePrefix": "vmss-01",
        "linuxConfiguration": {
          "disablePasswordAuthentication": true,
          "provisionVMAgent": true,
          "ssh": {
            "publicKeys": [
              {
                "path": "/home/azureuser/.ssh/authorized_keys"
              }
            ]
          }
        }
      },
      "networkProfile": {
        "networkInterfaceConfigurations": [
          {
            "name": "vmss-001",
            "properties": {
              "primary": true,
              "enableAcceleratedNetworking": true,
              "ipConfigurations": [
                {
                  "name": "ipconfig1",
                  "properties": {
                    "primary": true,
                    "subnet": {
                      "id": "[parameters('subnetId')]"
                    },
                    "privateIPAddressVersion": "IPv4",
                    "loadBalancerBackendAddressPools": [
                      {
                        "id": "[parameters('backendPoolId')]"
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
    }
  },
  "zones": [
    "1",
    "2",
    "3"
  ]
}
Links#
- RE:07 Self-preservation
- Automatic instance repairs
- Using Application Health extension with Virtual Machine Scale Sets
- Azure Load Balancer health probes
- Configure a Virtual Machine Scale Set with an existing Azure Standard Load Balancer
- Azure resource deployment