Alerts Details
To download the alerts specific to ALZ, click the Download icon in the top right corner of the AMBA documentation.
To see which policy alert rules are part of the ALZ pattern, go to the Policy-Initiatives page.
The resources, metric alerts, and their settings give you a starting point for answering two monitoring questions: “What should we monitor in Azure?” and “What alert settings should we use?” The settings are opinionated and meant to cover the most common Azure Landing Zone components, so we encourage you to adjust them to suit your monitoring needs based on how you’re using Azure.
If you have suggestions for other resources that should be included, please open an Issue on this page and provide the Azure resource provider and settings you’d like implemented. We can’t promise to implement them all, but we will look into them. If you’d like to contribute directly, follow the steps on how to contribute here.
The values shown for Aggregation, Operator, Threshold, WindowSize, Frequency and Severity have been derived from field experience and from what customers have implemented themselves. Alerts are based on Microsoft public guidance where available (indicated by a ‘Yes’ in the Verified column), and on practical application experience where public guidance is not available (indicated by a ‘No’ in the Verified column). Links to Product Group guidance can be found in the References column; where no guidance exists, we link to the description of the Metric on learn.microsoft.com.
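To make the Aggregation, Operator, and Threshold columns concrete, here is a minimal sketch of how a static-threshold rule could be evaluated over one window of samples. It is an illustration only, not how Azure Monitor is implemented: the helper `breaches`, the lookup tables, and the sample values are all assumptions, and rows with a `dynamic` threshold are not covered.

```python
import operator
from statistics import mean

# Hypothetical lookup tables; the keys mirror the labels used in the
# Operator and Aggregation columns of the alert tables.
OPERATORS = {
    "GreaterThan": operator.gt,
    "GreaterThanOrEqual": operator.ge,
    "LessThan": operator.lt,
    "LessThanOrEqual": operator.le,
}
AGGREGATIONS = {
    "Average": mean,
    "Total": sum,
    "Maximum": max,
    "Minimum": min,
}


def breaches(samples, aggregation, op, threshold):
    """Return True when the aggregated window crosses a static threshold."""
    value = AGGREGATIONS[aggregation](samples)
    return OPERATORS[op](value, threshold)


# Key Vault Availability row: Average, LessThan, 90, over a PT5M window.
window = [100, 100, 70, 80, 90]  # made-up one-minute samples
print(breaches(window, "Average", "LessThan", 90))  # prints True (average is 88)
```

The same helper can be pointed at any row with a numeric threshold by swapping in that row's Aggregation, Operator, and Threshold values.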
The Scope column details where we scoped the alerts as described in Introduction to deploying the ALZ Pattern.
Only a small number of resources support metric alert rules scoped at the subscription level, and such alerts apply only to resources deployed within the same region. The Support for Multiple Resources column shows which resources support metric alerts scoped at the subscription level. For a complete list of which resources support metric alert rules scoped at the subscription level, click here.
We have tried to limit side-to-side scrolling in the table, but it still holds a lot of information. We recommend that you click on the specific alert name, which takes you directly to the JSON definition of the alert you’re interested in.
Alert Name | Component | Metric | Aggregation | Operator | Threshold | Window | Frequency | Severity | Scope | Support for Multiple Resources | Verified | References |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Deploy Automation Account TotalJob Alert | Microsoft.Automation/automationAccounts | TotalJob | Average | GreaterThan | 0 | PT5M | PT1M | 2 | Resource | No | N | Azure Automation Azure Monitor Metrics |
Deploy KeyVault Availability Alert | Microsoft.KeyVault/vaults | Availability | Average | LessThan | 90 | PT5M | PT1M | 1 | Resource | Yes | Y | Monitoring KeyVault Reference Monitoring Microsoft.KeyVault/vaults KeyVault Insights Overview |
Deploy KeyVault Capacity Alert | Microsoft.KeyVault/vaults | SaturationShoebox | Average | GreaterThan | 75 | PT5M | PT1M | 1 | Resource | Yes | Y | Monitoring KeyVault Reference Monitoring Microsoft.KeyVault/vaults KeyVault Insights Overview |
Deploy KeyVault Latency Alert | Microsoft.KeyVault/vaults | ServiceApiLatency | Average | GreaterThan | 1000 | PT5M | PT5M | 3 | Resource | Yes | Y | Monitoring KeyVault Reference Monitoring Microsoft.KeyVault/vaults KeyVault Insights Overview |
Deploy KeyVault Requests Alert | Microsoft.KeyVault/vaults | ServiceApiResult | Average | GreaterThan | dynamic | PT5M | PT5M | 2 | Resource | Yes | Y | Monitoring KeyVault Reference Monitoring Microsoft.KeyVault/vaults KeyVault Insights Overview |
Deploy Azure Application Gateway BackendLastByteResponseTime Alert | Microsoft.Network/applicationGateways | BackendLastByteResponseTime | Total | GreaterThan | dynamic | PT5M | PT1M | 2 | Resource | No | N | Monitoring Azure Application Gateway data reference Metrics for Application Gateway Monitoring Azure Application Gateway |
Deploy AFW FirewallHealth Alert | Microsoft.Network/azureFirewalls | FirewallHealth | Average | LessThan | 90 | PT5M | PT1M | 0 | Resource | No | N | Overview of Azure Firewall logs and metrics |
Deploy AFW SNATPortUtilization Alert | Microsoft.Network/azureFirewalls | SNATPortUtilization | Average | GreaterThan | 80 | PT5M | PT1M | 1 | Resource | No | N | Overview of Azure Firewall logs and metrics |
Deploy ExpressRoute Circuits ARP Availability Alert | Microsoft.Network/expressRouteCircuits | ArpAvailability | Average | LessThan | 90 | PT5M | PT1M | 0 | Resource | No | Y | Monitor ExpressRoute Alerts ExpressRoute KQL Queries |
Deploy ExpressRoute Circuits BGP Availability Alert | Microsoft.Network/expressRouteCircuits | BgpAvailability | Average | LessThan | 90 | PT5M | PT1M | 0 | Resource | No | Y | Monitor ExpressRoute Alerts ExpressRoute KQL Queries |
Deploy ExpressRoute Circuits QosDropBitsInPerSecond Alert | Microsoft.Network/expressRouteCircuits | QosDropBitsInPerSecond | Average | GreaterThan | dynamic | PT5M | PT5M | 2 | Resource | No | N | Monitor ExpressRoute Alerts ExpressRoute KQL Queries |
Deploy ExpressRoute Circuits QosDropBitsOutPerSecond Alert | Microsoft.Network/expressRouteCircuits | QosDropBitsOutPerSecond | Average | GreaterThan | dynamic | PT5M | PT5M | 2 | Resource | No | N | Monitor ExpressRoute Alerts ExpressRoute KQL Queries |
Deploy ERG ExpressRoute Bits In Alert | Microsoft.Network/expressRouteGateways | ERGatewayConnectionBitsInPerSecond | Average | LessThan | 1 | PT5M | PT5M | 0 | Resource | No | N | ExpressRoute Monitoring Metrics Alerts for ExpressRoute Gateways |
Deploy ERG ExpressRoute Bits Out Alert | Microsoft.Network/expressRouteGateways | ERGatewayConnectionBitsOutPerSecond | Average | LessThan | 1 | PT5M | PT5M | 0 | Resource | No | N | ExpressRoute Monitoring Metrics Alerts for ExpressRoute Gateways |
Deploy ERG ExpressRoute CPU Utilization Alert | Microsoft.Network/expressRouteGateways | ExpressRouteGatewayCpuUtilization | Average | GreaterThan | 80 | PT5M | PT1M | 1 | Resource | No | Y | ExpressRoute Monitoring Metrics Alerts for ExpressRoute Gateways |
Deploy ER Direct Connection BitsInPerSecond Alert | Microsoft.Network/expressRoutePorts | PortBitsInPerSecond | Average | LessThan | 1 | PT5M | PT5M | 0 | Resource | No | N | |
Deploy ER Direct Connection BitsOutPerSecond Alert | Microsoft.Network/expressRoutePorts | PortBitsOutPerSecond | Average | LessThan | 1 | PT5M | PT5M | 0 | Resource | No | N | |
Deploy ER Direct LineProtocol Alert | Microsoft.Network/expressRoutePorts | LineProtocol | Average | LessThan | 0.9 | PT5M | PT5M | 0 | Resource | No | N | |
Deploy ER Direct RxLightLevel High Alert | Microsoft.Network/expressRoutePorts | RxLightLevel | Average | GreaterThan | 0 | PT5M | PT5M | 1 | Resource | No | N | |
Deploy ER Direct RxLightLevel Low Alert | Microsoft.Network/expressRoutePorts | RxLightLevel | Average | LessThan | -10 | PT5M | PT5M | 1 | Resource | No | N | |
Deploy ER Direct TxLightLevel High Alert | Microsoft.Network/expressRoutePorts | TxLightLevel | Average | GreaterThan | 0 | PT5M | PT5M | 1 | Resource | No | N | |
Deploy ER Direct TxLightLevel Low Alert | Microsoft.Network/expressRoutePorts | TxLightLevel | Average | LessThan | -10 | PT5M | PT5M | 1 | Resource | No | N | |
Deploy ALB Data Path Availability Alert | Microsoft.Network/loadBalancers | VipAvailability | Average | LessThan | 90 | PT5M | PT1M | 0 | Resource | No | Y | Azure Monitor supported metrics by resource type - Azure Load Balancer Azure Load Balancer Multi-Dimensional-Metrics Is The Data Path Up and Available for My Load-Balancer |
Deploy ALB Global Backend Availability Alert | Microsoft.Network/loadBalancers | GlobalBackendAvailability | Average | LessThan | 90 | PT5M | PT1M | 0 | Resource | No | N | Azure Monitor supported metrics by resource type - Azure Load Balancer |
Deploy ALB Health Probe Status Alert | Microsoft.Network/loadBalancers | DipAvailability | Average | LessThan | 90 | PT5M | PT1M | 0 | Resource | No | Y | Azure Monitor supported metrics by resource type - Azure Load Balancer Are Backend Instances for my Load-Balancer Responding to Probes |
Deploy ALB Used SNAT Ports Alert | Microsoft.Network/loadBalancers | UsedSNATPorts | Average | GreaterThan | 900 | PT5M | PT1M | 1 | Resource | No | Y | Azure Monitor supported metrics by resource type - Azure Load Balancer Load-Balancer Alerts Check My SNAT Port Usage and Allocation |
Deploy PDNSZ Capacity Utilization Alert | Microsoft.Network/privateDnsZones | VirtualNetworkLinkCapacityUtilization | Maximum | GreaterThanOrEqual | 80 | PT1H | PT1H | 2 | Resource | No | N | Private DNS Alert Metrics |
Deploy PDNSZ Query Volume Alert | Microsoft.Network/privateDnsZones | QueryVolume | Total | GreaterThanOrEqual | 500 | PT1H | PT1H | 4 | Resource | No | N | Private DNS Alert Metrics |
Deploy PDNSZ Record Set Capacity Alert | Microsoft.Network/privateDnsZones | RecordSetCapacityUtilization | Maximum | GreaterThanOrEqual | 80 | PT1H | PT1H | 2 | Resource | No | N | Private DNS Alert Metrics |
Deploy PDNSZ Registration Capacity Utilization Alert | Microsoft.Network/privateDnsZones | VirtualNetworkWithRegistrationCapacityUtilization | Maximum | GreaterThanOrEqual | 80 | PT1H | PT1H | 2 | Resource | No | N | Private DNS Alert Metrics |
Deploy PIP Bytes in DDoS Attack Alert | Microsoft.Network/publicIPAddresses | BytesInDDoS | Maximum | GreaterThan | 8000000 | PT5M | PT5M | 4 | Resource | No | N | Monitor Public IP Addresses Public IP Addresses Supported Metrics |
Deploy PIP DDoS Attack Alert | Microsoft.Network/publicIPAddresses | IfUnderDDoSAttack | Maximum | GreaterThan | 0 | PT5M | PT5M | 1 | Resource | No | Y | Monitor Public IP Addresses Public IP Addresses Supported Metrics |
Deploy PIP Packets in DDoS Attack Alert | Microsoft.Network/publicIPAddresses | PacketsInDDoS | Total | GreaterThanOrEqual | 40000 | PT5M | PT5M | 4 | Resource | No | N | Monitor Public IP Addresses Public IP Addresses Supported Metrics |
Deploy PIP VIP Availability Alert | Microsoft.Network/publicIPAddresses | VipAvailability | Average | LessThan | 90 | PT5M | PT1M | 1 | Resource | No | N | Monitor Public IP Addresses Public IP Addresses Supported Metrics |
Deploy VNetG Tunnel Bandwidth Alert | Microsoft.Network/virtualNetworkGateways | TunnelAverageBandwidth | Average | LessThan | 1 | PT5M | PT1M | 0 | Resource | No | N | Supported metrics for microsoft.network/virtualnetworkgateways |
Deploy VNetG Tunnel Egress Alert | Microsoft.Network/virtualNetworkGateways | TunnelEgressBytes | Average | LessThan | 1 | PT5M | PT5M | 0 | Resource | No | N | Supported metrics for microsoft.network/virtualnetworkgateways |
Deploy VNetG Egress Packet Drop Count Alert | Microsoft.Network/virtualNetworkGateways | TunnelEgressPacketDropCount | Average | GreaterThan | dynamic | PT5M | PT5M | 3 | Resource | No | N | Supported metrics for microsoft.network/virtualnetworkgateways |
Deploy VNetG Egress Packet Drop Mismatch Alert | Microsoft.Network/virtualNetworkGateways | TunnelEgressPacketDropTSMismatch | Average | GreaterThan | dynamic | PT5M | PT5M | 3 | Resource | No | N | Supported metrics for microsoft.network/virtualnetworkgateways |
Deploy VNetG ExpressRoute Bits Per Second Alert | Microsoft.Network/virtualNetworkGateways | ExpressRouteGatewayBitsPerSecond | Average | LessThan | 1 | PT5M | PT1M | 0 | Resource | No | N | Supported metrics for microsoft.network/virtualnetworkgateways |
Deploy VNetG ExpressRoute CPU Utilization Alert | Microsoft.Network/virtualNetworkGateways | ExpressRouteGatewayCpuUtilization | Average | GreaterThan | 80 | PT5M | PT1M | 1 | Resource | No | N | Supported metrics for microsoft.network/virtualnetworkgateways |
Deploy VNetG Tunnel Ingress Alert | Microsoft.Network/virtualNetworkGateways | TunnelIngressBytes | Average | LessThan | 1 | PT5M | PT5M | 0 | Resource | No | N | Supported metrics for microsoft.network/virtualnetworkgateways |
Deploy VNetG Ingress Packet Drop Count Alert | Microsoft.Network/virtualNetworkGateways | TunnelIngressPacketDropCount | Average | GreaterThan | dynamic | PT5M | PT5M | 3 | Resource | No | N | Supported metrics for microsoft.network/virtualnetworkgateways |
Deploy VNetG Ingress Packet Drop Mismatch Alert | Microsoft.Network/virtualNetworkGateways | TunnelIngressPacketDropTSMismatch | Average | GreaterThan | dynamic | PT5M | PT5M | 3 | Resource | No | N | Supported metrics for microsoft.network/virtualnetworkgateways |
Deploy VNet DDoS Attack Alert | Microsoft.Network/virtualNetworks | IfUnderDDoSAttack | Maximum | GreaterThan | 0 | PT5M | PT1M | 1 | Resource | No | N | Supported metrics for Microsoft.Network/virtualNetworks |
Deploy VPNG Bandwidth Utilization Alert | Microsoft.Network/vpnGateways | TunnelAverageBandwidth | Average | LessThan | 1 | PT5M | PT1M | 0 | Resource | No | N | Activity Log Service Notifications Best practices for setting up service health alerts |
Deploy VPNG BGP Peer Status Alert | Microsoft.Network/vpnGateways | BgpPeerStatus | Total | LessThan | 1 | PT5M | PT1M | 0 | Resource | No | N | Supported metrics for microsoft.network/vpngateways |
Deploy VPNG Egress Alert | Microsoft.Network/vpnGateways | TunnelEgressBytes | Average | LessThan | 1 | PT5M | PT5M | 0 | Resource | No | N | Supported metrics for microsoft.network/vpngateways |
Deploy VPNG Egress Packet Drop Count Alert | Microsoft.Network/vpnGateways | TunnelEgressPacketDropCount | Average | GreaterThan | dynamic | PT5M | PT5M | 3 | Resource | No | N | Supported metrics for microsoft.network/vpngateways |
Deploy VPNG Egress Packet Drop Mismatch Alert | Microsoft.Network/vpnGateways | TunnelEgressPacketDropTSMismatch | Average | GreaterThan | dynamic | PT5M | PT5M | 3 | Resource | No | N | Supported metrics for microsoft.network/vpngateways |
Deploy VPNG Ingress Alert | Microsoft.Network/vpnGateways | TunnelIngressBytes | Average | LessThan | 1 | PT5M | PT5M | 0 | Resource | No | N | Supported metrics for microsoft.network/vpngateways |
Deploy VPNG Ingress Packet Drop Count Alert | Microsoft.Network/vpnGateways | TunnelIngressPacketDropCount | Average | GreaterThan | dynamic | PT5M | PT5M | 3 | Resource | No | N | Supported metrics for microsoft.network/vpngateways |
Deploy VPNG Ingress Packet Drop Mismatch Alert | Microsoft.Network/vpnGateways | TunnelIngressPacketDropTSMismatch | Average | GreaterThan | dynamic | PT5M | PT5M | 3 | Resource | No | N | Supported metrics for microsoft.network/vpngateways |
Deploy SA Availability Alert | Microsoft.Storage/storageAccounts | Availability | Average | LessThan | 100 | PT5M | PT5M | 1 | Resource | No | Y | Monitoring Availability Supported metrics for Microsoft.Storage/storageAccounts |
Deploy SA Throttling Alert | Microsoft.Storage/storageAccounts/fileServices | Transactions | Total | GreaterThanOrEqual | 1 | PT15M | PT5M | 2 | Resource | No | N | High latency, low throughput, or low IOPS |
1 See “Why are the availability alert thresholds lower than 100% in this solution when the product group documentation recommends 100%?” in the FAQ for more details.
Use the following two sections to learn quickly when there’s a Service Health issue with an Azure resource. This saves you the effort of further troubleshooting and lets you focus on communicating with your user base. You can also use these alerts as part of your business continuity actions (remediations).
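The Window and Frequency columns in the table above use ISO 8601 duration notation, so `PT5M` is five minutes and `PT1H` is one hour. The sketch below converts these strings into `timedelta` values so that they can be compared or divided. The function name `parse_duration` is hypothetical, and the pattern deliberately handles only the hour, minute, and second forms that appear in these tables, not the full ISO 8601 grammar.

```python
import re
from datetime import timedelta

# Matches time-only ISO 8601 durations such as PT5M, PT1H or PT1H30M.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_duration(text):
    """Convert a time-only ISO 8601 duration string to a timedelta."""
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(group or 0) for group in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# A PT5M window checked at a PT1M frequency is evaluated five times
# in the time it takes for one full window to elapse.
window = parse_duration("PT5M")
frequency = parse_duration("PT1M")
print(window // frequency)  # prints 5
```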
Alert Policy Name | Alert Name | targetScope | Category | Properties.cause | Properties.currentHealthStatus | Scope | Verified | References |
---|---|---|---|---|---|---|---|---|
Deploy Resource Health Unhealthy Alert | ResourceHealthUnhealthyAlert | managementGroup | ResourceHealth | | | Subscription | N | Resource Health Best practices for setting up service health alerts |
Alert Policy Name | Alert Name | PolicyScope | Category | Properties.incidentType | Scope | Documented | References |
---|---|---|---|---|---|---|---|
Deploy Service Health Advisory Alert | ServiceHealthAdvisoryEvent | managementGroup | ServiceHealth | ActionRequired | Subscription | Yes | Activity Log Service Notifications Best practices for setting up service health alerts |
Deploy Service Health Incident Alert | ServiceHealthIncident | managementGroup | ServiceHealth | Incident | Subscription | Yes | Activity Log Service Notifications Best practices for setting up service health alerts |
Deploy Service Health Maintenance Alert | ServiceHealthPlannedMaintenance | managementGroup | ServiceHealth | Maintenance | Subscription | Yes | Activity Log Service Notifications Best practices for setting up service health alerts |
Deploy Service Health Security Advisory Alert | ServiceHealthSecurityIncident | managementGroup | ServiceHealth | Security | Subscription | Yes | Activity Log Service Notifications Best practices for setting up service health alerts |
The following table lists operational Activity Log alerts that notify your team when certain resources have been deleted or modified.
There isn’t any per-resource-type guidance, so what’s provided is general guidance on alerting on the deletion of specific resources. The list may grow in the future, and you can create your own alerts following the pattern used for these Activity Log alerts.
Alert Policy Name | Alert Name | PolicyScope | category | operationName | status | Scope | Documented | References |
---|---|---|---|---|---|---|---|---|
Deploy Activity Log Key Vault Delete Alert | ActivityKeyVaultDelete | managementGroup | Administrative | Microsoft.KeyVault/vaults/delete | [succeeded] | Subscription | No | Activity Log Service Notifications Best practices for setting up service health alerts |
Deploy Activity Log Azure Firewall Delete Alert | ActivityAzureFirewallDelete | managementGroup | Administrative | Microsoft.Network/azureFirewalls/delete | [succeeded] | Resource | No | Activity Log Service Notifications Best practices for setting up service health alerts |
Policy to Deploy Activity Log NSG Delete Alert | ActivityNSGDelete | managementGroup | Administrative | Microsoft.Network/networkSecurityGroups/delete | [succeeded] | Resource | No | Activity Log Service Notifications Best practices for setting up service health alerts |
Deploy Activity Log Route Table Update Alert | ActivityUDRUpdate | managementGroup | Administrative | Microsoft.Network/routeTables/routes/write | [succeeded] | Resource | No | Activity Log Service Notifications Best practices for setting up service health alerts |
Deploy Activity Log VPN Gateway Delete Alert | ActivityVPNGatewayDelete | managementGroup | Administrative | Microsoft.Network/vpnGateways/delete | [succeeded] | Subscription | No | |
Deploy Activity Log LA Workspace Delete Alert | ActivityLAWorkspaceDelete | managementGroup | Administrative | Microsoft.OperationalInsights/workspaces/delete | [succeeded] | Subscription | No | Activity Log Service Notifications Best practices for setting up service health alerts |
Deploy Activity Log LA Workspace Regenerate Key Alert | ActivityLAWorkspaceRegenKey | managementGroup | Administrative | Microsoft.OperationalInsights/workspaces/regeneratesharedkey/action | [succeeded] | Subscription | No | Activity Log Service Notifications Best practices for setting up service health alerts |
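Each row above combines a category, an operationName, and a status, so the pattern for your own Activity Log alerts can be sketched as a small builder. The helper `activity_log_condition` is hypothetical. The `allOf`, `field`, `equals`, and `containsAny` keys reflect our understanding of the Azure Monitor activity log alert condition schema and should be treated as an assumption; check them against the JSON definition linked from the alert name before relying on them.

```python
import json


def activity_log_condition(category, operation_name, statuses):
    """Build a condition that matches one operation in the Activity Log.

    The key names are assumed from the activity log alert schema;
    verify them against the published JSON definitions.
    """
    return {
        "allOf": [
            {"field": "category", "equals": category},
            {"field": "operationName", "equals": operation_name},
            {"field": "status", "containsAny": list(statuses)},
        ]
    }


# Values taken from the Key Vault Delete row of the table above.
condition = activity_log_condition(
    "Administrative",
    "Microsoft.KeyVault/vaults/delete",
    ["succeeded"],
)
print(json.dumps(condition, indent=2))
```

To alert on a different resource type, only the operationName value changes; the category and status values stay the same as in the rows above.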
Once VM Insights has been enabled in your environment, the following alert rules can be configured for use via the Baseline Alerts framework.
N/A: Not applicable, not used in the query or used as a parameter.
Alert Name | Component | Aggregation | Operator | Threshold | WindowSize | Frequency | ResolveTime | FailingPeriods | Dimensions | Severity | Query | Verified | References |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Deploy VM Data Disk Read Latency Alert | Compute/virtualMachines | Average | GreaterThan | 25 | PT15M | PT5M | 0:10:00 | | | 2 | | N | Monitor virtual machines with Azure Monitor: Alerts |
Deploy VM Data Disk Free Space Percentage Alert | Compute/virtualMachines | Average | LessThan | 10 | PT15M | PT5M | 0:10:00 | | | 2 | | Y | Monitor virtual machines with Azure Monitor: Alerts |
Deploy VM Data Disk Write Latency Alert | Compute/virtualMachines | Average | GreaterThan | 25 | PT15M | PT5M | 0:10:00 | | | 2 | | Y | Monitor virtual machines with Azure Monitor: Alerts |
Deploy VM Network Read (bytes/sec) Alert | Compute/virtualMachines | Average | GreaterThan | 10000000 | PT15M | PT5M | 0:10:00 | | | 2 | | Y | Monitor virtual machines with Azure Monitor: Alerts |
Deploy VM Network Write (bytes/sec) Alert | Compute/virtualMachines | Average | GreaterThan | 10000000 | PT15M | PT5M | 0:10:00 | | | 2 | | Y | Monitor virtual machines with Azure Monitor: Alerts |
Deploy VM OS Disk Read Latency Alert | Compute/virtualMachines | Average | GreaterThan | 25 | PT15M | PT5M | 0:10:00 | | | 2 | | N | Monitor virtual machines with Azure Monitor: Alerts |
Deploy VM OS Disk Free Space Percentage Alert | Compute/virtualMachines | Average | LessThan | 10 | PT15M | PT5M | 0:10:00 | | | 2 | | Y | Monitor virtual machines with Azure Monitor: Alerts |
Deploy VM OS Disk Write Latency Alert | Compute/virtualMachines | Average | GreaterThan | 25 | PT15M | PT5M | 0:10:00 | | | 2 | | N | Monitor virtual machines with Azure Monitor: Alerts |
Deploy VM Processor Utilization Percentage Alert | Compute/virtualMachines | Average | GreaterThan | 85 | PT15M | PT5M | 0:10:00 | | | 2 | | Y | Monitor virtual machines with Azure Monitor: Alerts |
Deploy VM Available Memory Percentage Alert | Compute/virtualMachines | Average | LessThan | 10 | PT15M | PT5M | 0:10:00 | | | 2 | | Y | Monitor virtual machines with Azure Monitor: Alerts |
The following policy disables the classic alerts that are available in Azure Backup and enables the Azure Monitor alerts.
Security Alerts and Job Failure alerts are summarized in the “Using Backup Center” documentation.
PolicyName | Component | Category | Scope | Support for Multiple Resources | Verified | References |
---|---|---|---|---|---|---|
Deploy RV Backup Health Monitoring Alerts | Microsoft.RecoveryServices/Vaults | Microsoft.RecoveryServices/vaults/monitoringSettings.classicAlertSettings.alertsForCriticalOperations | Resource | No | Y | Azure Monitor Alerts for Azure Backup Move to Azure Monitor Alerts |