ExpressRoute Circuits
The resiliency recommendations in this guidance cover ExpressRoute circuits and their associated settings.
Summary of Recommendations
The following table lists the resiliency recommendations for ExpressRoute circuits and associated resources.
Recommendation | Category | Impact | State | ARG Query Available |
---|---|---|---|---|
ERC-1 - Connect your on-premises network to critical workloads in Azure through two or more ExpressRoute circuits in different peering locations | Availability | High | Verified | No |
ERC-2 - Ensure the two physical links of your ExpressRoute circuit are connected to two distinct edge devices in your network | Availability | High | Verified | No |
ERC-3 - Ensure both connections of an ExpressRoute circuit are configured in active-active mode | Availability | High | Verified | Yes |
ERC-4 - Ensure Bidirectional Forwarding Detection is enabled and configured on customer or provider edge routing devices | Availability | High | Verified | No |
ERC-5 - Configure monitoring and alerting for ExpressRoute circuits | Monitoring | Medium | Verified | No |
ERC-6 - Configure service health to receive ExpressRoute circuit maintenance notification | Monitoring | Medium | Verified | No |
ERC-7 - Use a site-to-site VPN as an interim backup solution for a single ExpressRoute circuit | Disaster Recovery | Medium | Verified | No |
Recommendation Details
ERC-1 - Connect your on-premises network to critical workloads in Azure through two or more ExpressRoute circuits in different peering locations
Category: Availability
Impact: High
Guidance
Connect each ExpressRoute gateway to at least two circuits provisioned in different peering locations, so that the loss of a single peering location does not cut off connectivity.
Resources
Resource Graph Query
// under-development
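The diversity check described in the guidance can be sketched outside Resource Graph as follows. This is a minimal Python illustration under stated assumptions: the input shape (a mapping from gateway name to the peering location of each connected circuit), the function name, and the sample gateway names are all hypothetical and would need to be populated from your own inventory.

```python
from typing import Dict, List


def gateways_lacking_diversity(
    gateway_circuits: Dict[str, List[str]],
) -> List[str]:
    """Return gateways whose circuits span fewer than two peering locations.

    gateway_circuits maps a gateway name to the peering location of each
    circuit connected to it (hypothetical input shape for this sketch).
    """
    flagged = []
    for gateway, locations in gateway_circuits.items():
        # Normalise case and whitespace so "Amsterdam" and "amsterdam " match.
        distinct = {loc.strip().lower() for loc in locations}
        if len(distinct) < 2:
            flagged.append(gateway)
    return sorted(flagged)


sample = {
    "gw-prod": ["Amsterdam", "London"],
    "gw-test": ["Amsterdam", "Amsterdam"],  # two circuits, same location
    "gw-dev": ["Dublin"],                   # single circuit
}
print(gateways_lacking_diversity(sample))  # ['gw-dev', 'gw-test']
```

Note that a gateway with two circuits in the same peering location is flagged just like a gateway with a single circuit, because both share one location-level failure domain.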
ERC-2 - Ensure the two physical links of your ExpressRoute circuit are connected to two distinct edge devices in your network
Category: Availability
Impact: High
Guidance
Microsoft (in the ExpressRoute Direct model) or the ExpressRoute provider (in the provider-based model) always offers a physically redundant service. Make sure the same level of physical redundancy (two physical devices, two physical links) is maintained along the entire path from the ExpressRoute peering location to your network.
Resources
- Designing for high availability with ExpressRoute
- Azure Well-Architected Framework review - Azure ExpressRoute - Design Checklist
Resource Graph Query
// cannot-be-validated-with-arg
ERC-3 - Ensure both connections of an ExpressRoute circuit are configured in active-active mode
Category: Availability
Impact: High
Guidance
For high availability, operate both connections of an ExpressRoute circuit in active-active mode. When the connections run in active-active mode, the Microsoft network load balances traffic across them on a per-flow basis.
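To make "per-flow" concrete, the following Python sketch shows one common way a flow can be pinned to a link: hashing the flow's 5-tuple. It is only an illustration of the general technique. The hash function, the tuple layout, and the names `pick_connection`, `primary`, and `secondary` are assumptions for this example and do not describe how the Microsoft network actually distributes traffic.

```python
import hashlib
from typing import Sequence, Tuple

# src IP, dst IP, src port, dst port, protocol
Flow = Tuple[str, str, int, int, str]


def pick_connection(
    flow: Flow,
    connections: Sequence[str] = ("primary", "secondary"),
) -> str:
    """Map a flow to one connection using a stable hash of its 5-tuple."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(connections)
    return connections[index]


flow = ("10.0.0.4", "192.168.1.10", 50123, 443, "tcp")
# Every packet of the same flow maps to the same connection.
print(pick_connection(flow) == pick_connection(flow))  # True
```

Because the mapping depends only on the flow identity, packets within a flow stay in order on one link while different flows spread across both connections.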
Resources
Resource Graph Query
// Azure Resource Graph Query
// Goal: Show any ExpressRoute circuit where one of the connections is not configured (i.e. no IP)
Resources
| where type =~ 'Microsoft.Network/expressRouteCircuits'
| where properties.value[0].provisioningState != 'Succeeded' or properties.value[1].provisioningState != 'Succeeded'
| where not(properties.peerings[0].properties.primaryPeerAddressPrefix != "null" and properties.peerings[0].properties.secondaryPeerAddressPrefix != "null")
| project recommendationId = "erc-3", name, id, tags, param1 = strcat("Peer1_IP: ",properties.peerings[0].properties.primaryPeerAddressPrefix), param2=strcat("Peer2_IP: ", properties.peerings[0].properties.secondaryPeerAddressPrefix)
| order by id asc
ERC-4 - Ensure Bidirectional Forwarding Detection is enabled and configured on customer or provider edge routing devices
Category: Availability
Impact: High
Guidance
Enabling Bidirectional Forwarding Detection (BFD) over ExpressRoute speeds up link failure detection between the Microsoft Enterprise Edge (MSEE) devices and the routers on which your ExpressRoute circuit is configured (CE/PE). You can configure ExpressRoute over your own edge routing devices or, if you use a managed Layer 3 connection service, over your partner's edge routing devices.
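How much faster failure detection becomes depends on the BFD timers negotiated between the two peers. The Python sketch below applies the asynchronous-mode detection time rule from RFC 5880 (remote detect multiplier times the larger of the local receive interval and the remote transmit interval). The timer values used here (300 ms intervals, a multiplier of 3, and a 180-second BGP hold time for comparison) are assumptions chosen for illustration; verify the actual values against your device configuration and the current ExpressRoute documentation.

```python
def bfd_detection_time_ms(
    local_required_min_rx_ms: int,
    remote_desired_min_tx_ms: int,
    remote_detect_mult: int,
) -> int:
    """Detection time seen by the local system in BFD asynchronous mode."""
    # The slower of the two advertised intervals sets the packet rate.
    negotiated_interval = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * negotiated_interval


# Assumed example values: both peers use 300 ms intervals and a multiplier of 3.
detection = bfd_detection_time_ms(300, 300, 3)
print(f"BFD detection time: {detection} ms")  # 900 ms

# Assumed BGP hold time of 180 s, for comparison with detection without BFD.
bgp_hold_time_ms = 180 * 1000
print(f"Speed-up versus BGP hold timer: {bgp_hold_time_ms // detection}x")  # 200x
```

If one peer advertises a slower interval, for example 1000 ms, the same function returns 3000 ms, which shows why both the CE/PE and MSEE timers matter.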
Resources
Resource Graph Query
// cannot-be-validated-with-arg
ERC-5 - Configure monitoring and alerting for ExpressRoute circuits
Category: Monitoring
Impact: Medium
Guidance
Configure monitoring using Network Insights for ExpressRoute circuit availability, circuit QoS, and throughput. Configure alerts for availability and circuit QoS metrics according to the Azure Monitor Baseline Alerts guidance for ExpressRoute circuits, and for throughput metrics when bits per second exceed a threshold appropriate for the ExpressRoute circuit SKU and customer usage.
Configure alerts using Connection Monitor for ExpressRoute with a Log Analytics workspace and Network Watcher. Alert when ChecksFailedPercent exceeds 5% and when RoundTripTimeMs exceeds a pre-tested average appropriate to the environment.
For ExpressRoute Direct, configure ExpressRoute Traffic Collector to send flow logs to a Log Analytics workspace.
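The Connection Monitor thresholds in the guidance can be expressed as a simple evaluation over a window of samples. The Python sketch below is an offline illustration, not an Azure Monitor alert rule: the `Sample` type, the `evaluate` function, the choice to average over the window, and the 25 ms baseline are all assumptions made for this example. Only the 5% ChecksFailedPercent threshold comes from the guidance.

```python
from dataclasses import dataclass
from typing import List

CHECKS_FAILED_THRESHOLD_PCT = 5.0  # threshold taken from the guidance


@dataclass
class Sample:
    checks_failed_percent: float
    round_trip_time_ms: float


def evaluate(samples: List[Sample], rtt_baseline_ms: float) -> List[str]:
    """Return the alert names that the averaged samples would trigger."""
    if not samples:
        return []
    avg_failed = sum(s.checks_failed_percent for s in samples) / len(samples)
    avg_rtt = sum(s.round_trip_time_ms for s in samples) / len(samples)
    alerts = []
    if avg_failed > CHECKS_FAILED_THRESHOLD_PCT:
        alerts.append("ChecksFailedPercent")
    if avg_rtt > rtt_baseline_ms:
        alerts.append("RoundTripTimeMs")
    return alerts


# Hypothetical window: one probe interval lost 20% of its checks.
window = [Sample(0.0, 18.0), Sample(20.0, 22.0), Sample(0.0, 19.0)]
print(evaluate(window, rtt_baseline_ms=25.0))  # ['ChecksFailedPercent']
```

The round-trip baseline is passed in as a parameter because, as the guidance notes, it has to come from pre-tested measurements in your own environment.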
Resources
- Azure ExpressRoute Insights using Network Insights
- Monitoring Azure ExpressRoute
- Configure Traffic Collector for ExpressRoute Direct
Resource Graph Query
// under-development
ERC-6 - Configure service health to receive ExpressRoute circuit maintenance notification
Category: Monitoring
Impact: Medium
Guidance
ExpressRoute uses Azure Service Health to announce planned and unplanned maintenance. Configure Service Health alerts so that you are notified about changes made to your ExpressRoute circuits.
Resources
Resource Graph Query
// under-development
ERC-7 - Use a site-to-site VPN as an interim backup solution for a single ExpressRoute circuit
Category: Disaster Recovery
Impact: Medium
Guidance
If you have not yet added a second ExpressRoute circuit for an ExpressRoute Gateway, use a site-to-site VPN as an interim solution until the second ExpressRoute circuit is available.
Resources
Resource Graph Query
// under-development