Azure Proactive Resiliency Library v2

Azure High Performance Computing

Dependent Azure Resource Recommendations

Recommendation | Provider Namespace | Resource Type
Monitor Batch account quota | Batch | batchAccounts
Create an Azure Batch pool across Availability Zones | Batch | batchAccounts

General Workload Guidance

Summary

Recommendation | Impact | Category | Automation Available | In Azure Advisor
Ensure file shares that store job metadata are accessible from all head nodes | High | High Availability | No | No
Automatically grow and shrink HPC Pack cluster resources | Medium | Scalability | No | No
Use multiple head nodes for HPC Pack | Medium | High Availability | No | No
Use HPC Pack Azure AD Integration or other highly available AD configuration | High | High Availability | No | No

Details


Ensure file shares that store job metadata are accessible from all head nodes

Impact:  High Category:  High Availability

APRL GUID:  4c78fab4-845a-495d-ab14-3ad51de53a2a

Description:

Currently, all HPC Pack ARM templates create the cluster share on a single head node, so the share is not highly available.

Potential Benefits:

Enhances job metadata availability
Learn More:
Learn More

ARG Query:


// under-development



Automatically grow and shrink HPC Pack cluster resources

Impact:  Medium Category:  Scalability

APRL GUID:  b02b5a0e-3770-44da-a099-5dd4d9f8cd70

Description:

By deploying Azure "burst" nodes (both Windows and Linux) in your HPC Pack cluster, or by creating your HPC Pack cluster in Azure, you can automatically grow or shrink the cluster's resources, such as nodes or cores, according to the workload on the cluster.
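To make the idea of sizing the cluster to its workload concrete, the following is a minimal sketch of a scaling decision and not HPC Pack's actual auto grow/shrink logic. The function name `target_node_count` and all of its parameters are hypothetical and chosen only for illustration. It assumes demand is measured in queued cores and that every node offers the same number of cores.

```python
import math


def target_node_count(queued_cores, cores_per_node, min_nodes, max_nodes):
    """Return the number of nodes to keep online for the queued demand.

    Hypothetical helper for illustration only; it does not call any
    HPC Pack or Azure API.
    """
    if cores_per_node <= 0:
        raise ValueError("cores_per_node must be positive")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")

    # Round up so a partially used node is still provisioned.
    needed = math.ceil(queued_cores / cores_per_node)

    # Clamp to the configured bounds: never shrink below min_nodes,
    # never grow beyond max_nodes.
    return max(min_nodes, min(max_nodes, needed))


if __name__ == "__main__":
    # 100 queued cores on 16-core nodes needs 7 nodes (6.25 rounded up).
    print(target_node_count(100, 16, min_nodes=1, max_nodes=10))
```

The lower bound keeps a baseline of capacity when the queue is empty, and the upper bound caps cost when demand spikes.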

Potential Benefits:

Efficient, uninterrupted execution
Learn More:
Learn More

ARG Query:


// under-development



Use multiple head nodes for HPC Pack

Impact:  Medium Category:  High Availability

APRL GUID:  a48b1be6-77a3-4e3c-8205-dda2ba010a99

Description:

Establish a cluster with a minimum of two head nodes. In the event of a head node failure, the active HPC Service will be automatically transferred from the affected head node to another functioning one.
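The failover behavior described above can be sketched as a small selection function. This is an illustrative model only and not how HPC Pack implements failover. The function `select_active_head_node` and the node names are hypothetical, and node health is assumed to be supplied by some external check.

```python
def select_active_head_node(head_nodes, healthy, current_active):
    """Pick the head node that should host the active service.

    head_nodes: ordered list of head node names (hypothetical).
    healthy: set of node names currently reported as healthy.
    current_active: the node hosting the active service right now.
    Returns the chosen node name, or None if no head node is healthy.
    """
    # No failover is needed while the active node is still healthy.
    if current_active in healthy:
        return current_active

    # Otherwise move the service to the first healthy head node.
    for node in head_nodes:
        if node in healthy:
            return node

    # With a single head node, a failure leaves nowhere to fail over to.
    return None


if __name__ == "__main__":
    # Two head nodes: the service moves to hn2 when hn1 fails.
    print(select_active_head_node(["hn1", "hn2"], {"hn2"}, "hn1"))
    # One head node: its failure leaves no candidate.
    print(select_active_head_node(["hn1"], set(), "hn1"))
```

The second call shows why at least two head nodes are recommended: with only one, there is no candidate to take over the service.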

Potential Benefits:

Enhanced reliability for HPC
Learn More:
Learn More

ARG Query:


// under-development



Use HPC Pack Azure AD Integration or other highly available AD configuration

Impact:  High Category:  High Availability

APRL GUID:  37eec891-7880-4759-b597-7cd925512fe3

Description:

When HPC Pack fails to connect to the domain controller, administrators and users cannot connect to the HPC Service, and therefore cannot manage the cluster or submit jobs to it.

Potential Benefits:

Enhanced reliability and job management
Learn More:
Learn More

ARG Query:


// under-development