Run production workloads on two or more VMs using VMSS Flex
Impact: High
Category: High Availability
APRL GUID: 273f6b30-68e0-4241-85ea-acf15ffb60bf
Description:
Production VM workloads should be deployed on multiple VMs and grouped in a VMSS Flex instance so that they are distributed across the platform, minimizing the impact of platform faults and updates.
Azure Availability Zones are physically separate locations within each Azure region. They tolerate local failures and protect applications and data against datacenter failures.
Availability sets are not scheduled for immediate deprecation, but they are planned to be deprecated in the future. Migrate workloads from VMs to VMSS Flex, deployed either across zones or within a single zone across different fault domains (FDs), for better reliability.
Replicating Azure VMs via Site Recovery entails continuous, asynchronous disk replication to a target region. Recovery points are generated every few minutes, giving a Recovery Point Objective (RPO) measured in minutes.
A data disk is a managed disk attached to a virtual machine for storing database or other essential data. Data disks are registered as SCSI drives and are labeled with a letter that you choose.
Enable backups for your virtual machines with Azure Backup to secure and quickly recover your data. This service offers simple, secure, and cost-effective solutions for backing up and recovering data from the Microsoft Azure cloud.
Azure Virtual Machine (VM) instances go through various states, including provisioning and power states. A VM that is not running may indicate a problem or may no longer be needed, in which case removing it can reduce costs.
Accelerated networking enables single root I/O virtualization (SR-IOV) on a VM, greatly improving its networking performance. It bypasses the host in the data path, which reduces latency, jitter, and CPU utilization for demanding network workloads on supported VM types.
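As a minimal sketch of the power-state review described above, the following Python function lists VMs that are not running. It assumes the VM inventory has already been exported as a list of dictionaries; the `name` and `powerState` keys are hypothetical field names chosen for illustration, and the values follow the `PowerState/...` code format that Azure reports.

```python
from typing import Dict, List

RUNNING = "PowerState/running"


def non_running_vms(vms: List[Dict[str, str]]) -> List[str]:
    """Return the names of VMs whose power state is anything other than running.

    Each item is assumed to carry hypothetical 'name' and 'powerState' keys.
    A missing power state is also reported, since it cannot be confirmed as running.
    """
    return [vm["name"] for vm in vms if vm.get("powerState") != RUNNING]


if __name__ == "__main__":
    inventory = [
        {"name": "web-01", "powerState": "PowerState/running"},
        {"name": "batch-02", "powerState": "PowerState/deallocated"},
        {"name": "legacy-03", "powerState": "PowerState/stopped"},
    ]
    # Candidates to investigate or remove.
    print(non_running_vms(inventory))
```

A VM on this list is only a candidate for review; whether it indicates a fault or an unused resource still needs a human decision.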
When AccelNet is enabled, you must manually update the GuestOS NIC driver
Impact: Low
Category: Governance
APRL GUID: 73d1bb04-7d3e-0d47-bc0d-63afe773b5fe
Description:
When Accelerated Networking is enabled, the default Azure VNet interface in the guest OS is replaced by a Mellanox interface whose driver comes from a third party. Marketplace images include the latest Mellanox drivers, but after deployment, keeping the driver up to date is the user's responsibility.
Azure Resource Graph query:
// cannot-be-validated-with-arg
VMs should not have a Public IP directly associated
Impact: Medium
Category: Security
APRL GUID: 1f629a30-c9d0-d241-82ee-6f2eb9d42cb4
Description:
For outbound internet connectivity of Virtual Machines, NAT Gateway or Azure Firewall is recommended instead of a directly associated public IP. Both improve security and service resilience because they offer higher availability and more SNAT ports.
VM network interfaces and associated subnets both have a Network Security Group associated
Impact: Low
Category: Security
APRL GUID: 82b3cf6b-9ae2-2e44-b193-10793213f676
Description:
Unless you have a specific reason, associate a network security group with either a subnet or a network interface, but not both. Rule conflicts between the two associations can cause unexpected communication issues that are difficult to troubleshoot.
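To illustrate why two associations can conflict, here is a simplified Python sketch of inbound evaluation when a network security group (NSG) is attached at both levels: traffic must be allowed by the subnet NSG and by the network interface NSG. The `Rule` type and both functions are hypothetical. The model matches on destination port only, treats a level with no NSG as not filtering, and uses a single fall-through deny in place of Azure's full set of default rules.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    priority: int  # lower number is evaluated first
    port: int      # single destination port (simplification)
    allow: bool    # True = Allow, False = Deny


def nsg_allows(rules: Optional[Sequence[Rule]], port: int) -> bool:
    """Return True if this NSG lets inbound traffic to `port` through."""
    if rules is None:
        # Simplifying assumption: no NSG at this level means no filtering here.
        return True
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port == port:
            return rule.allow  # first matching rule wins
    return False  # stands in for the default deny-all inbound rule


def inbound_allowed(subnet_nsg: Optional[Sequence[Rule]],
                    nic_nsg: Optional[Sequence[Rule]],
                    port: int) -> bool:
    """Inbound traffic has to pass the subnet NSG and then the NIC NSG."""
    return nsg_allows(subnet_nsg, port) and nsg_allows(nic_nsg, port)


if __name__ == "__main__":
    subnet = [Rule(priority=100, port=443, allow=True)]
    nic = [Rule(priority=100, port=22, allow=True)]  # nobody added 443 here
    print(inbound_allowed(subnet, nic, 443))   # blocked by the NIC-level NSG
    print(inbound_allowed(subnet, None, 443))  # allowed with a single association
```

In this sketch the subnet rule for port 443 looks correct in isolation, yet the traffic is dropped by the second NSG, which is the kind of surprise a single association avoids.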
IP Forwarding should only be enabled for Network Virtual Appliances
Impact: Medium
Category: Security
APRL GUID: 41a22a5e-5e08-9647-92d0-2ffe9ef1bdad
Description:
IP forwarding allows a virtual machine network interface to receive and send network traffic not destined for or originating from its assigned IP addresses.
Shared disks should only be enabled in clustered servers
Impact: Medium
Category: Other Best Practices
APRL GUID: 3263a64a-c256-de48-9818-afd3cbc55c2a
Description:
Azure shared disks let you attach a disk to multiple VMs at once for deploying or migrating clustered applications. The feature is suitable only when the disk is shared among members of a VM cluster.
Network access to the VM disk should be set to Disable public access and enable private access
Impact: Low
Category: Security
APRL GUID: 70b1d2be-e6c4-b54e-9959-b1b690f9e485
Description:
Change the disk's network access setting to "Disable public access and enable private access" and create a Private Endpoint. This restricts direct public access and ensures that connections are made privately, which improves data protection and reduces exposure to external threats.
Azure Resource Graph query:
// Azure Resource Graph Query
// Find all Disks with "Enable public access from all networks" enabled
resources
| where type =~ 'Microsoft.Compute/disks'
| where properties.publicNetworkAccess == "Enabled"
| project id, name, tags, lowerCaseDiskId = tolower(id)
| join kind = leftouter (
    resources
    | where type =~ 'Microsoft.Compute/virtualMachines'
    | project osDiskVmName = name, lowerCaseOsDiskId = tolower(properties.storageProfile.osDisk.managedDisk.id)
    | join kind = fullouter (
        resources
        | where type =~ 'Microsoft.Compute/virtualMachines'
        | mv-expand dataDisks = properties.storageProfile.dataDisks
        | project dataDiskVmName = name, lowerCaseDataDiskId = tolower(dataDisks.managedDisk.id)
        )
        on $left.lowerCaseOsDiskId == $right.lowerCaseDataDiskId
    | project lowerCaseDiskId = coalesce(lowerCaseOsDiskId, lowerCaseDataDiskId), vmName = coalesce(osDiskVmName, dataDiskVmName)
    )
    on lowerCaseDiskId
| summarize vmNames = make_set(vmName) by name, id, tostring(tags)
| extend param1 = iif(isempty(vmNames[0]), "VMName: n/a", strcat("VMName: ", strcat_array(vmNames, ", ")))
| project recommendationId = "70b1d2be-e6c4-b54e-9959-b1b690f9e485", name, id, tags, param1
| order by id asc
Ensure that your VMs are compliant with Azure Policies
Impact: Low
Category: Governance
APRL GUID: c42343ae-2712-2843-a285-3437eb0b28a1
Description:
Keeping your virtual machine (VM) secure is crucial for the applications you run. This involves using various Azure services and features to ensure secure access to your VMs and the secure storage of your data, aiming for overall security of your VM and applications.
Virtual Machines should have Azure Disk Encryption or EncryptionAtHost enabled
Impact: High
Category: Security
APRL GUID: f0a97179-133a-6e4f-8a49-8a44da73ffce
Description:
Consider enabling Azure Disk Encryption (ADE) for encrypting Azure VM disks using DM-Crypt (Linux) or BitLocker (Windows). Additionally, consider Encryption at host and Confidential disk encryption for enhanced data security.
VM Insights monitors VM and scale set performance, health, running processes, and dependencies. It enhances the predictability of application performance and availability by pinpointing performance bottlenecks and network issues, and it clarifies if problems are related to other dependencies.
Configure monitoring for all Azure Virtual Machines
Impact: Low
Category: Monitoring and Alerting
APRL GUID: 4a9d8973-6dba-0042-b3aa-07924877ebd5
Description:
Azure Monitor Metrics automatically receives platform metrics. Platform logs, which provide detailed diagnostic and auditing information for resources and the Azure platform they depend on, must be routed manually to be collected.
Maintenance configuration settings let users schedule and manage updates so that updates or interruptions on the VM occur within a planned timeframe.
Don't use A or B-Series VMs for production needing constant full CPU performance
Impact: High
Category: Scalability
APRL GUID: 3201dba8-d1da-4826-98a4-104066545170
Description:
A-series VMs are tailored for entry-level workloads such as development and test servers, low-traffic web servers, and small to medium databases.
Mission Critical Workloads should consider using Premium or Ultra Disks
Impact: High
Category: Scalability
APRL GUID: df0ff862-814d-45a3-95e4-4fad5a244ba6
Description:
Compared to Standard HDD and Standard SSD, Premium SSD, Premium SSD v2, and Ultra Disks offer improved performance, configurability, and higher single-instance VM uptime SLAs. The lowest SLA among all disks attached to a VM applies to the VM as a whole, so use Premium or Ultra Disks for every disk to obtain the highest uptime SLA.
Potential Benefits:
Enhanced performance, cost efficiency, and uptime SLA
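The "lowest SLA of all disks applies" rule is a minimum over the attached disks, which the following Python sketch makes explicit. The percentages in `SLA_BY_DISK_SKU` are illustrative assumptions based on the published single-instance VM SLA tiers and should be checked against the current Azure SLA document. The function name is hypothetical, and Premium SSD v2 is left out because its tier is not stated here.

```python
from typing import Dict, Iterable

# Assumed single-instance VM connectivity SLA (percent) per managed disk SKU.
# Verify these figures against the current Azure SLA before relying on them.
SLA_BY_DISK_SKU: Dict[str, float] = {
    "Premium_LRS": 99.9,
    "UltraSSD_LRS": 99.9,
    "StandardSSD_LRS": 99.5,
    "Standard_LRS": 95.0,  # Standard HDD
}


def effective_vm_sla(disk_skus: Iterable[str]) -> float:
    """Return the SLA that applies to a VM: the lowest SLA of any attached disk."""
    skus = list(disk_skus)
    if not skus:
        raise ValueError("a VM has at least an OS disk")
    unknown = [s for s in skus if s not in SLA_BY_DISK_SKU]
    if unknown:
        raise ValueError(f"no SLA figure assumed for: {unknown}")
    return min(SLA_BY_DISK_SKU[s] for s in skus)


if __name__ == "__main__":
    # One Standard HDD data disk pulls the whole VM down to the lowest tier.
    print(effective_vm_sla(["Premium_LRS", "Premium_LRS", "Standard_LRS"]))
    print(effective_vm_sla(["Premium_LRS", "UltraSSD_LRS"]))
```

The first call shows the practical consequence: upgrading only the OS disk does not raise the SLA while any lower-tier disk remains attached.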
Use Azure Boost VMs for Maintenance sensitive workload
Impact: Medium
Category: High Availability
APRL GUID: 9ab499d8-8844-424d-a2d4-8f53690eb8f8
Description:
If the workload is maintenance sensitive, consider Azure Boost compatible VMs. Azure Boost is designed to lessen the impact on customers when Azure maintenance activities occur on the host.
Azure Resource Graph query:
// under-development
Enable Scheduled Events for Maintenance sensitive workload VMs
Impact: Medium
Category: High Availability
APRL GUID: 2de8fa5e-14f4-4c4c-857f-1520f87a629f
Description:
If your workload is maintenance sensitive, enable Scheduled Events. Scheduled Events is an Azure Metadata Service that gives your application time to prepare for virtual machine maintenance by providing information about upcoming events such as reboots, which reduces disruption.
Azure Resource Graph query:
// under-development
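As a rough sketch of how an application might consume Scheduled Events, the Python below reads the events document from the Instance Metadata Service endpoint and filters it for events that affect a given VM. The API version shown is one published version and may not be the latest. The function names are hypothetical, and the choice of which event types count as disruptive is an assumption to adapt to your workload.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Non-routable metadata endpoint, reachable only from inside the VM.
IMDS_URL = "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"

# Assumption: event types this workload wants to react to.
DISRUPTIVE_TYPES = {"Freeze", "Reboot", "Redeploy", "Preempt", "Terminate"}


def fetch_scheduled_events(timeout: float = 5.0) -> Dict[str, Any]:
    """Query the metadata service; the 'Metadata: true' header is required."""
    request = urllib.request.Request(IMDS_URL, headers={"Metadata": "true"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def disruptive_events(document: Dict[str, Any], vm_name: str) -> List[Dict[str, Any]]:
    """Return scheduled events of a disruptive type that list `vm_name` as a resource."""
    return [
        event
        for event in document.get("Events", [])
        if event.get("EventType") in DISRUPTIVE_TYPES
        and event.get("EventStatus") == "Scheduled"
        and vm_name in event.get("Resources", [])
    ]


if __name__ == "__main__":
    # Hand-written sample document so the filter can be tried outside Azure.
    sample = {
        "DocumentIncarnation": 1,
        "Events": [
            {"EventId": "evt-1", "EventType": "Reboot", "EventStatus": "Scheduled",
             "ResourceType": "VirtualMachine", "Resources": ["vm-1"]},
            {"EventId": "evt-2", "EventType": "Freeze", "EventStatus": "Scheduled",
             "ResourceType": "VirtualMachine", "Resources": ["vm-2"]},
        ],
    }
    for event in disruptive_events(sample, "vm-1"):
        print(event["EventId"], event["EventType"])
```

In a real deployment the application would poll `fetch_scheduled_events()` periodically and drain or checkpoint work when `disruptive_events` returns anything; the polling interval and the reaction are left to the workload.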
Use Azure Disks with Zone Redundant Storage for higher resiliency and availability
Impact: Medium
Category: High Availability
APRL GUID: fa0cf4f5-0b21-47b7-89a9-ee936f193ce1
Description:
Azure disks offer a zone-redundant storage (ZRS) option for workloads that need to be resilient to an entire zone being down. Because data is replicated across zones, ZRS disks have higher write latency than the locally redundant storage (LRS) option, so make sure to benchmark your disks.
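For a first impression of the write-latency difference, a small Python sketch such as the following can be run against a directory on an LRS disk and again on a ZRS disk. It times synchronous 4 KiB writes and reports the median, 95th percentile, and maximum. The function name and defaults are hypothetical, and a dedicated tool such as fio will give more representative numbers than this rough probe.

```python
import os
import statistics
import tempfile
import time
from typing import Dict


def measure_write_latency(directory: str, writes: int = 50,
                          block_size: int = 4096) -> Dict[str, float]:
    """Time `writes` synchronous writes in `directory`; results are in milliseconds."""
    payload = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to the disk so the write is actually timed
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    latencies.sort()
    p95_index = int(0.95 * (len(latencies) - 1))
    return {
        "median_ms": statistics.median(latencies),
        "p95_ms": latencies[p95_index],
        "max_ms": latencies[-1],
    }


if __name__ == "__main__":
    # Point this at a mount backed by the disk under test.
    print(measure_write_latency(tempfile.gettempdir()))
```

Run it several times on each disk type and compare the distributions rather than single values, since individual runs can vary considerably.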
Azure Capacity Reservations ensure high availability for virtual machines by reserving compute capacity in advance within a specific region or availability zone. This guarantees that VMs will have the necessary resources during peak demand or maintenance events, enhancing reliability and uptime.
If you've installed the Azure Linux Agent or are using an endorsed distribution image, ensure your agent version is up-to-date. Some Linux distributions may disable auto-update or use older agent versions.
Azure Resource Graph query:
// under-development
Reserve Compute Capacity in Disaster Recovery Regions
Impact: Medium
Category: Disaster Recovery
APRL GUID: 587ca3e4-113b-4c4f-b4e0-92cd8d2065b6
Description:
On-Demand Capacity Reservations support the recovery of virtual machines after a natural disaster by reserving compute capacity in advance within a specific region or zone. The reserved capacity ensures that VMs have the necessary resources during disaster recovery failover events, which reduces downtime.