Azure Proactive Resiliency Library v2

SAP on Azure

Refer to:

  • Azure Center for SAP Solutions
  • OpenSource Quality Checks
  • OpenSource Inventory Checks

General Workload Guidance

Summary

Recommendation | Impact | Category | Automation Available | In Azure Advisor
Ensure that each SAP production system is designed for high availability across availability zones | High | High Availability | No | No
Run SAP application servers on two or more VMs using VMSS Flex | High | High Availability | No | No
If using single-instance VMs all OS and data disks must be Premium SSD or Ultra Disk | High | High Availability | Yes | Yes
Ensure synchronous data replication (SYNC mode) between primary and secondary VM nodes | High | High Availability | No | No
Design SAP shared file systems for high availability, utilizing availability zones when possible | High | High Availability | No | No
Test high availability solutions thoroughly to ensure failovers work as expected | High | High Availability | No | No
Remove unwanted location constraints from Linux Pacemaker clusters | High | High Availability | No | No
Secure compute resource capacity for critical VM roles in DR region | Medium | Disaster Recovery | No | No
Replicate production databases to DR location (ASYNC) using the vendor's replication technology | High | Disaster Recovery | No | No
SAP components are backed up to DR location using an appropriate backup tool or ASR | High | Disaster Recovery | No | No
SAP shared file systems are replicated or backed up to DR location | High | Disaster Recovery | No | No
Automate DR infrastructure build or pre-deploy DR resources | Medium | Disaster Recovery | No | No
Document and test the DR procedure to ensure it meets RPO and RTO targets | Medium | Disaster Recovery | No | No
Ensure there is a robust monitoring and alerting solution in place for the entire DR solution | Medium | Disaster Recovery | No | No
Configure scheduled events notification | High | Monitoring and Alerting | No | No
Configure a Pacemaker cluster for SAP ASCS high availability | High | High Availability | No | No
Ensure the load balancer is configured correctly for SAP ASCS High availability | High | High Availability | No | No
Ensure the Pacemaker cluster has been set up for SAP HANA DB high availability | High | High Availability | No | No
Ensure the load balancer is configured correctly for SAP HANA DB High availability | High | High Availability | No | No
Review SAP configuration for timeout values used with Azure NetApp Files | High | High Availability | No | No
Provision recommended storage configuration on database VMs | High | Scalability | No | No

Details


Ensure that each SAP production system is designed for high availability across availability zones

Impact:  High Category:  High Availability

APRL GUID:  a9b649a5-2bfe-40ca-9b8f-34f9c71dfa12

Description:

Azure availability zones are physically separate locations within an Azure region that are tolerant to local failures. Use them to protect your applications and data against data center failures, which are unlikely but possible. Ensure that every single point of failure in each SAP production system is protected by a high availability design that spans multiple availability zones. If you cannot deploy across different zones in a region, refer to the Microsoft guidance on high availability deployment options for SAP workloads.

Potential Benefits:

High availability for SAP systems
Learn More:
SAP ACSS Quality Insights
OpenSource Inventory Checks
OpenSource Quality Checks
Move Regional SAP HA to Zonal
High Availability Deployment Options for SAP

ARG Query:

// under-development



Run SAP application servers on two or more VMs using VMSS Flex

Impact:  High Category:  High Availability

APRL GUID:  49bd34ab-d117-4b0e-99f8-34cc8a5394bc

Description:

Use Virtual Machine Scale Sets (VMSS) with flexible orchestration to distribute the virtual machines across the specified zones and, within each zone, across different fault domains on a best-effort basis. Configure VMSS Flex according to the Microsoft recommendations for SAP workloads, using the right mode and settings. If your SAP application servers use neither VMSS Flex nor availability sets with fault domain and update domain distribution, consider moving to a VMSS Flex architecture to improve the resiliency of your SAP deployment. The blog post linked below describes how to migrate existing SAP workloads that are deployed in an availability set or availability zone to a flexible scale set with the FD=1 deployment option.

Potential Benefits:

Enhanced resiliency for SAP on Azure
Learn More:
OpenSource Inventory Checks
Virtual machine Scale Set SAP Deployment Guide
Considerations for Flexible VM Scale Sets for SAP
Migrate existing SAP system VMs to VMSS Flex

ARG Query:

// under-development



If using single-instance VMs all OS and data disks must be Premium SSD or Ultra Disk

Impact:  High Category:  High Availability

APRL GUID:  b60ae773-9917-4bca-8a42-7cb45365a917

Description:

For single-instance VMs, both OS and data disks must be either Premium SSD or Ultra Disk to achieve the single-instance SLA of 99.9% availability.
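
To make the 99.9% figure concrete, the following minimal sketch converts an availability percentage into a downtime budget. The function name and the 30-day period are assumptions chosen for illustration; they are not part of the SLA definition.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: int = 30) -> float:
    """Return the downtime budget, in minutes, implied by an availability percentage."""
    period_minutes = period_days * 24 * 60
    # The unavailable share of the period is (100 - SLA) percent.
    return period_minutes * (100.0 - sla_percent) / 100.0


if __name__ == "__main__":
    # 99.9% over an assumed 30-day period
    print(f"{allowed_downtime_minutes(99.9):.1f} minutes")
```

Under these assumptions, 99.9% availability leaves a budget of roughly 43 minutes of downtime in a 30-day period.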

Potential Benefits:

Higher SLA of 99.9% with SSDs
Learn More:
SAP ACSS Insights
OpenSource Inventory Checks
OpenSource Quality Checks
VM SLA
SAP Storage Planning Guide

ARG Query:

// Azure Resource Graph Query
// Find all single-instance VMs that have an attached disk whose SKU tier is not Premium or Ultra.

resources
| where type =~ 'Microsoft.Compute/virtualMachines'
// Single-instance VMs belong to neither a scale set nor an availability set
| where isnull(properties.virtualMachineScaleSet.id)
| where isnull(properties.availabilitySet)
| extend lname = tolower(name)
// Collect, per VM name, the disks that are neither Premium nor Ultra
| join kind=leftouter(resources
    | where type =~ 'Microsoft.Compute/disks'
    | where not(sku.tier =~ 'Premium') and not(sku.tier =~ 'Ultra')
    // Segment 8 of the managedBy resource ID is the name of the owning VM
    | extend lname = tolower(tostring(split(managedBy, '/')[8]))
    | project lname, name
    | summarize disks = make_list(name) by lname) on lname
// Keep only VMs that have at least one affected disk
| where isnotnull(disks)
| project recommendationId = "b60ae773-9917-4bca-8a42-7cb45365a917", name, id, tags, param1=strcat("AffectedDisks: ", disks)


Ensure synchronous data replication (SYNC mode) between primary and secondary VM nodes

Impact:  High Category:  High Availability

APRL GUID:  094400a5-f112-408d-a334-afd68873ff0f

Description:

Implement high availability for databases using database-native replication technologies, and replicate the data synchronously (SYNC mode) from the primary database to a standby node.

Potential Benefits:

Ensures high availability for SAP data
Learn More:
SAP ACSS Insights
OpenSource Quality Checks

ARG Query:

// under-development



Design SAP shared file systems for high availability, utilizing availability zones when possible

Impact:  High Category:  High Availability

APRL GUID:  e09ca960-20b7-4831-b85b-83ec84c1390e

Description:

SAP shared file systems such as /sapmnt, /usr/sap/trans, and interface directories should be made highly available.
For Azure file shares, we recommend zone-redundant storage (ZRS); for Azure NetApp Files, use zonal replication for your volumes.
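
As a rough illustration of the zone-redundant storage recommendation, the sketch below flags storage accounts whose SKU name does not indicate zone redundancy. The input shape (a list of dictionaries with `name` and `sku` keys) and the account names are hypothetical; the check assumes that zone-redundant SKU names end in `_ZRS`, `_GZRS`, or `_RAGZRS`.

```python
from typing import Dict, List

# Assumption: SKU names with these suffixes keep copies in multiple availability zones.
ZONE_REDUNDANT_SUFFIXES = ("_ZRS", "_GZRS", "_RAGZRS")


def is_zone_redundant(sku_name: str) -> bool:
    """Return True if the storage SKU name indicates zone redundancy."""
    return sku_name.upper().endswith(ZONE_REDUNDANT_SUFFIXES)


def accounts_without_zone_redundancy(accounts: List[Dict[str, str]]) -> List[str]:
    """Return the names of accounts whose SKU is not zone-redundant."""
    return [a["name"] for a in accounts if not is_zone_redundant(a["sku"])]


if __name__ == "__main__":
    # Hypothetical inventory, for example exported from a resource listing
    inventory = [
        {"name": "sapmntprod", "sku": "Premium_ZRS"},
        {"name": "saptransprod", "sku": "Premium_LRS"},
    ]
    print(accounts_without_zone_redundancy(inventory))
```

A check like this only looks at the SKU name. It does not confirm that the region supports availability zones or that the file share is mounted by the SAP systems.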

Potential Benefits:

Enhanced data availability for SAP
Learn More:
OpenSource Inventory Checks

ARG Query:

// under-development



Test high availability solutions thoroughly to ensure failovers work as expected

Impact:  High Category:  High Availability

APRL GUID:  5663a808-56be-49ea-8d5c-c5dfc6925f76

Description:

Test all high availability solutions thoroughly, including kernel panic on Linux VMs and failback. Include zonal failure scenarios in your testing. The tests should confirm that each layer of your SAP solution (database, central services, application servers, and shared file systems) is configured correctly for zone redundancy, that the solution meets RPO = 0, and that the application fails over automatically within your RTO.
Failback can be either automatic or manual.

Potential Benefits:

Ensures failover reliability for SAP on Azure
Learn More:
Test Cases

ARG Query:

// under-development



Remove unwanted location constraints from Linux Pacemaker clusters

Impact:  High Category:  High Availability

APRL GUID:  1b8a3051-dfd4-4780-bfb7-446296774029

Description:

When you run a migrate command in a Linux Pacemaker cluster, the cluster generates a "prefer" location constraint that moves a resource to the specified node and prioritizes that node for the resource.
During planned maintenance and failover testing, you can use the migrate command to relocate resources with minimal disruption. The constraint is intended only for short-term adjustments, but it typically stays in the cluster configuration until it is removed, and it can interfere with later failover behavior.
Once the planned task that required the migration is complete, manually remove the constraint to revert to the cluster's original resource management policies.
This approach allows controlled resource movement within the cluster for maintenance while preserving the integrity of the cluster's configuration.
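
One way to spot leftover constraints is to scan the cluster configuration (CIB) XML for location constraints created by command-line moves. The sketch below assumes that such constraints carry an ID starting with `cli-prefer-` or `cli-ban-`, and that the XML was obtained separately, for example with `cibadmin --query`. The resource and node names in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Assumption: constraints left behind by move/migrate/ban commands use these ID prefixes.
CLI_PREFIXES = ("cli-prefer-", "cli-ban-")


def find_cli_constraints(cib_xml: str) -> List[Dict[str, Optional[str]]]:
    """Return location constraints whose ID suggests a command-line move or ban."""
    root = ET.fromstring(cib_xml)
    leftovers = []
    for constraint in root.iter("rsc_location"):
        constraint_id = constraint.get("id", "")
        if constraint_id.startswith(CLI_PREFIXES):
            leftovers.append({
                "id": constraint_id,
                "resource": constraint.get("rsc"),
                "node": constraint.get("node"),  # None if the node is set in a nested rule
                "score": constraint.get("score"),
            })
    return leftovers


if __name__ == "__main__":
    # Hypothetical CIB fragment with one leftover "prefer" constraint
    sample = """
    <cib><configuration><constraints>
      <rsc_location id="cli-prefer-rsc_example" rsc="rsc_example"
                    role="Started" node="node2" score="INFINITY"/>
      <rsc_location id="loc_regular" rsc="rsc_other" node="node1" score="100"/>
    </constraints></configuration></cib>
    """
    for item in find_cli_constraints(sample):
        print(item)
```

Any constraint the scan reports can then be reviewed and, if it is no longer wanted, removed with the cluster's own tooling, such as `crm resource clear` or `pcs resource clear` followed by the resource name.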

Potential Benefits:

Enhanced maintenance and failover handling
Learn More:
OpenSource Inventory Checks

ARG Query:

// under-development



Secure compute resource capacity for critical VM roles in DR region

Impact:  Medium Category:  Disaster Recovery

APRL GUID:  820b4c0c-8a74-442a-8ba7-b0cb840cd983

Description:

To ensure the availability of compute resources for critical VM roles in a DR region, consider securing capacity either through a warm standby approach or by utilizing Azure's On-demand Capacity Reservation.

Warm standby involves keeping VMs in the DR region running. On-demand Capacity Reservation, on the other hand, reserves compute capacity without having to run the VMs, allowing you to start them when needed. When DR VMs are not needed, the reserved capacity may safely be used to run other workloads without the risk of losing the capacity to other customers. This strategy guarantees resource availability for your critical workloads in the event of a disaster, balancing cost and readiness.

Potential Benefits:

Guarantees DR region availability
Learn More:
Capacity Reservation

ARG Query:

// under-development




Replicate production databases to DR location (ASYNC) using the vendor's replication technology

Impact:  High Category:  Disaster Recovery

APRL GUID:  fb8bdcee-d88f-408d-8572-a76a4aaa733b

Description:

Replicate production databases (ASYNC) to the DR location using the database vendor's replication technology.

Potential Benefits:

Enhanced DR resilience
Learn More:
SAP Disaster Recovery Guide

ARG Query:

// under-development



SAP components are backed up to DR location using an appropriate backup tool or ASR

Impact:  High Category:  Disaster Recovery

APRL GUID:  41f0d88e-7866-4444-aac4-ef5fee3e6874

Description:

SAP components such as (A)SCS, application servers, and Web Dispatchers are backed up to the DR location using an appropriate backup tool or Azure Site Recovery (ASR).

Potential Benefits:

Ensures SAP data safety and recovery
Learn More:
SAP ACSS Insights
OpenSource Inventory Checks

ARG Query:

// under-development



SAP shared file systems are replicated or backed up to DR location

Impact:  High Category:  Disaster Recovery

APRL GUID:  ee4dc309-00a1-49fe-92fa-1724baf5f103

Description:

Replicate or back up SAP shared file systems, such as /sapmnt, to the DR location so that they are available there if the primary region fails.

Potential Benefits:

Enhances SAP DR oversight
Learn More:
DR Guidance

ARG Query:

// under-development



Automate DR infrastructure build or pre-deploy DR resources

Impact:  Medium Category:  Disaster Recovery

APRL GUID:  0fabc52e-cdbb-4acd-8626-c4c637061e2d

Description:

Automate the build of disaster recovery (DR) infrastructure (or pre-deploy DR resources) and streamline SAP service recovery as much as possible.

Potential Benefits:

Faster SAP recovery, reduced downtime
Learn More:
DR Guidance

ARG Query:

// under-development



Document and test the DR procedure to ensure it meets RPO and RTO targets

Impact:  Medium Category:  Disaster Recovery

APRL GUID:  c300e949-528d-4ac9-889b-cacf8b4a6e90

Description:

Create detailed documentation of your DR procedures for each layer of the SAP architecture: database, central services, application servers, and shared file systems. This documentation should include configuration details, failover mechanisms, and step-by-step recovery procedures.

Test a wide range of failure scenarios, including regional outages. Testing should confirm that your DR strategy is robust, meets your RPO and RTO targets, and provides seamless failover across all layers of the SAP architecture. This will ensure a comprehensive and resilient DR strategy capable of withstanding regional failures and ensuring business continuity.

Potential Benefits:

Ensures robust DR, meets RPO/RTO
Learn More:
DR Guidance

ARG Query:

// under-development



Ensure there is a robust monitoring and alerting solution in place for the entire DR solution

Impact:  Medium Category:  Disaster Recovery

APRL GUID:  c27134b7-6917-4852-8276-3dbef5c71578

Description:

For an SAP solution hosted on Azure it is imperative to implement a robust monitoring and alerting solution that comprehensively covers DR of each layer of the SAP architecture. Given the complexity of SAP systems, which span multiple layers using diverse technologies and Azure resources, each with potentially distinct DR replication mechanisms, an appropriate monitoring strategy is crucial. The different layers include database, central services, application, and shared file systems.

Potential Benefits:

Improved DR oversight and rapid issue response
Learn More:
DR Guidance

ARG Query:

// under-development



Configure scheduled events notification

Impact:  High Category:  Monitoring and Alerting

APRL GUID:  6b589ce6-c847-4cee-af35-f6e8eb1cf983

Description:

Scheduled Events is an Azure Metadata Service that provides proactive notifications about upcoming maintenance events (for example, a reboot) so that your application can prepare for them and limit disruption. Configure scheduled events for all your critical Azure VMs.

The azure-events-az resource agent can also integrate scheduled events with Pacemaker clusters.

To ensure high availability and service continuity in your Azure VMs, you should configure the azure-events-az resource agent within your Pacemaker clusters. This agent monitors for scheduled Azure maintenance events and can proactively relocate resources for a graceful node shutdown. Configure the agent to monitor specific event types such as Reboot and Redeploy, and enable verbose logging for detailed diagnostics.

In addition, define a procedure for how to react to scheduled events.
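
For VMs that are not covered by the Pacemaker agent, a small poller can read scheduled events directly from the Azure Instance Metadata Service. The sketch below is a minimal illustration, not a production agent: the API version, the watched event types, and the function names are assumptions, and the sample document in the `__main__` block is made up so the filter can be tried without network access.

```python
import json
import urllib.request
from typing import Any, Dict, Iterable, List

# Non-routable metadata endpoint, reachable only from inside an Azure VM.
# Assumption: this api-version is available in your environment.
ENDPOINT = "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
WATCHED_TYPES = ("Reboot", "Redeploy")


def fetch_scheduled_events(timeout: float = 5.0) -> Dict[str, Any]:
    """Query the metadata service and return the scheduled events document."""
    request = urllib.request.Request(ENDPOINT, headers={"Metadata": "true"})
    # Bypass any configured proxy; the metadata endpoint must be called directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return json.load(response)


def pending_events(document: Dict[str, Any],
                   watched: Iterable[str] = WATCHED_TYPES) -> List[Dict[str, Any]]:
    """Return the events in the document whose type is in the watched set."""
    watched_set = set(watched)
    return [e for e in document.get("Events", []) if e.get("EventType") in watched_set]


if __name__ == "__main__":
    # Made-up sample document in the shape returned by the service
    sample = {
        "DocumentIncarnation": 1,
        "Events": [
            {"EventId": "example-1", "EventType": "Freeze", "Resources": ["vm-app-1"]},
            {"EventId": "example-2", "EventType": "Reboot", "Resources": ["vm-db-1"]},
        ],
    }
    for event in pending_events(sample):
        print(event["EventType"], event["Resources"])
```

What the application does once an event is reported (drain users, stop SAP instances, trigger a failover) belongs in the documented reaction procedure.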

Potential Benefits:

Proactive maintenance awareness
Learn More:
VM Scheduled Events
Configure Pacemaker for Azure Scheduled Events

ARG Query:

// under-development



Configure a Pacemaker cluster for SAP ASCS high availability

Impact:  High Category:  High Availability

APRL GUID:  9d8f6678-694c-4da4-8384-415201f65194

Description:

For the ASCS-Pacemaker (Central Server Instance), ensure that the Pacemaker cluster configuration parameters are correctly set up for SAP ASCS high availability.

Potential Benefits:

Enhances SAP ASCS uptime
Learn More:
SAP ACSS Insights
OpenSource Quality Checks
ASCS-Pacemaker - Central Server Instance

ARG Query:

// under-development



Ensure the load balancer is configured correctly for SAP ASCS High availability

Impact:  High Category:  High Availability

APRL GUID:  5c2e52d0-25be-4b1c-833c-b98b5ef1a26b

Description:

For the ASCS-LB (Central Server Instance), ensure that the load balancer is configured correctly for SAP ASCS high availability.

Potential Benefits:

Enhanced HA for SAP ASCS
Learn More:
SAP ACSS Insights
OpenSource Quality Checks
ASCS-LB - Central Server Instance

ARG Query:

// under-development



Ensure the Pacemaker cluster has been set up for SAP HANA DB high availability

Impact:  High Category:  High Availability

APRL GUID:  6648fe61-880d-4a96-8d2d-190a23d5580b

Description:

For the DBHANA-Pacemaker (Database Instance), ensure that the Pacemaker cluster configuration parameters are correctly set up for SAP HANA database high availability.

Potential Benefits:

Enhances SAP HANA DB uptime
Learn More:
SAP ACSS Insights
OpenSource Quality Checks
DBHANA-Pacemaker - Database Instance

ARG Query:

// under-development



Ensure the load balancer is configured correctly for SAP HANA DB High availability

Impact:  High Category:  High Availability

APRL GUID:  2e4c2171-a83f-4238-a8e3-b51c90d86a99

Description:

For the DBHANA-LB (Database Instance), make sure the load balancer is configured correctly for SAP HANA database high availability.

Potential Benefits:

Enhanced DB availability
Learn More:
SAP ACSS Insights
OpenSource Quality Checks
DBHANA-LB - Database Instance

ARG Query:

// under-development



Review SAP configuration for timeout values used with Azure NetApp Files

Impact:  High Category:  High Availability

APRL GUID:  4884cada-b9c7-42d5-8153-3853e4a6f6c4

Description:

High availability of SAP systems that use Azure NetApp Files relies on proper timeout values to prevent disruption to your application. Review the documentation and make sure your configuration uses the timeout values it specifies.

Potential Benefits:

Improve resiliency and performance of SAP on Azure
Learn More:
SAP on Azure NetApp Planning Guide

ARG Query:

// under-development