PreReconciliationChecker
Description
PreReconciliationChecker allows resources to perform validation or checks before ARM reconciliation begins. This extension is invoked before sending any requests to Azure, giving resources the ability to block reconciliation until certain prerequisites are met.
The interface is called early in the reconciliation process, after basic claim/validation but before any ARM GET/PUT/PATCH operations. It provides an opportunity to ensure that conditions necessary for successful reconciliation are satisfied.
Interface Definition
See the PreReconciliationChecker interface definition in the source code.
Motivation
The PreReconciliationChecker extension exists to handle cases where:
- Prerequisite validation: Ensuring required conditions are met before attempting ARM operations
- Owner readiness: Verifying that parent resources are in a state suitable for child creation
- Quota checking: Validating that an operation won’t exceed limits
- Preventing futile operations: Blocking reconciliation attempts that cannot possibly succeed
- External dependencies: Waiting for external systems or resources to be ready
- Resource state validation: Ensuring the resource is in an appropriate state for reconciliation
The default behavior attempts reconciliation immediately. Some resources need to verify prerequisites first to avoid unnecessary ARM calls, rate limiting, or error conditions.
When to Use
Implement PreReconciliationChecker when:
- ✅ Parent/owner resources must reach certain states first
- ✅ External dependencies must be satisfied before reconciling
- ✅ Spec validation requires complex logic beyond webhooks
- ✅ Reconciliation would fail due to known prerequisites not being met
- ✅ Rate limiting or quota concerns require gating
- ✅ Resource configuration requires specific ordering
Do not use PreReconciliationChecker when:
- ❌ Simple validation can be done in admission webhooks
- ❌ The check should happen after reconciliation (use PostReconciliationChecker)
- ❌ You’re trying to modify the resource (use other extensions)
- ❌ The default behavior works correctly
Example: Waiting for Owner Ready State
A common pattern is waiting for the parent resource to be ready:
// Simplified example
func (ex *MyResourceExtension) PreReconcileCheck(
	ctx context.Context,
	obj genruntime.MetaObject,
	owner genruntime.MetaObject,
	...
) (extensions.PreReconcileCheckResult, error) {
	if owner != nil {
		ready := conditions.IsReady(owner)
		if !ready {
			return extensions.BlockReconcile(
				fmt.Sprintf("waiting for owner %s to be ready", owner.GetName())), nil
		}
	}
	return extensions.ProceedWithReconcile(), nil
}
Key aspects:
- Type assertions: A full implementation asserts both the resource type and the hub version; the simplified example above omits this step
- Owner check: Validates owner state before proceeding
- Clear blocking messages: Provides reason for blocking
- No error: Check succeeded, but reconciliation should wait
- Proceeds when ready: Returns proceed result when checks pass
Common Patterns
Pattern 1: Check Owner Ready Condition
func (ex *ResourceExtension) PreReconcileCheck(
	ctx context.Context,
	obj genruntime.MetaObject,
	owner genruntime.MetaObject,
	resourceResolver *resolver.Resolver,
	armClient *genericarmclient.GenericClient,
	log logr.Logger,
	next extensions.PreReconcileCheckFunc,
) (extensions.PreReconcileCheckResult, error) {
	// Checked assertion: return an error instead of panicking on an unexpected type.
	// The typed resource is not needed for the owner check, so it is discarded.
	if _, ok := obj.(*myservice.MyResource); !ok {
		return extensions.PreReconcileCheckResult{},
			fmt.Errorf("unexpected resource type %T", obj)
	}

	// Ensure owner is ready before creating child
	if owner == nil {
		return extensions.BlockReconcile("owner not found"), nil
	}
	if !conditions.IsReady(owner) {
		return extensions.BlockReconcile(
			fmt.Sprintf("owner %s/%s not ready",
				owner.GetNamespace(), owner.GetName())), nil
	}

	// Owner ready, call next checker in chain
	return next(ctx, obj, owner, resourceResolver, armClient, log)
}
Example 1: Virtual Machine Scale Set Instances
Problem: Cannot manage individual VMSS instances when the scale set is updating or deleting.
Solution: Check VMSS state before attempting instance operations.
// Block on: Updating, Deallocating, Deleting
// Allow on: Running, Succeeded
Example 2: Container Service Agent Pools
Problem: Agent pools cannot be modified when the cluster is upgrading.
Solution: Check AKS cluster upgrade state before pool operations.
// Block on: Upgrading, Updating
// Allow on: Succeeded, Running
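Both examples come down to looking up the parent's current state in a set of blocking states. The following is a minimal sketch of that lookup using only the standard library. The names `blockingStates` and `checkParentState` are hypothetical, and the state strings are taken from the comments above rather than from any Azure API definition.

```go
package main

import "fmt"

// blockingStates lists parent states in which child operations should wait.
// The names mirror the comments above; the values a real Azure service
// reports are an assumption here.
var blockingStates = map[string]bool{
	"Updating":     true,
	"Upgrading":    true,
	"Deallocating": true,
	"Deleting":     true,
}

// checkParentState is a hypothetical helper. It returns a non-empty reason
// when reconciliation should be blocked, and "" when it may proceed.
func checkParentState(parentName, state string) string {
	if blockingStates[state] {
		return fmt.Sprintf("parent %s is %s, waiting for it to settle", parentName, state)
	}
	return ""
}
```

Inside a PreReconcileCheck implementation, a non-empty reason would be passed to extensions.BlockReconcile, and an empty one would fall through to the next check.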
Pattern 2: Validate Required References
func (ex *ResourceExtension) PreReconcileCheck(
	ctx context.Context,
	obj genruntime.MetaObject,
	owner genruntime.MetaObject,
	resourceResolver *resolver.Resolver,
	armClient *genericarmclient.GenericClient,
	log logr.Logger,
	next extensions.PreReconcileCheckFunc,
) (extensions.PreReconcileCheckResult, error) {
	// Checked assertion: return an error instead of panicking on an unexpected type
	resource, ok := obj.(*myservice.MyResource)
	if !ok {
		return extensions.PreReconcileCheckResult{},
			fmt.Errorf("unexpected resource type %T", obj)
	}

	// Check if required references are ready
	if resource.Spec.NetworkReference != nil {
		resolved, err := resourceResolver.ResolveResourceReference(
			ctx, resource.Spec.NetworkReference)
		if err != nil {
			return extensions.PreReconcileCheckResult{}, err
		}
		if !resolved.Found {
			return extensions.BlockReconcile(
				"required network reference not found"), nil
		}
		if !conditions.IsReady(resolved.Resource) {
			return extensions.BlockReconcile(
				"required network not ready"), nil
		}
	}

	// References ready, call next checker in chain
	return next(ctx, obj, owner, resourceResolver, armClient, log)
}
Pattern 3: Validate Configuration Prerequisites
func (ex *ResourceExtension) PreReconcileCheck(
	ctx context.Context,
	obj genruntime.MetaObject,
	owner genruntime.MetaObject,
	resourceResolver *resolver.Resolver,
	armClient *genericarmclient.GenericClient,
	log logr.Logger,
	next extensions.PreReconcileCheckFunc,
) (extensions.PreReconcileCheckResult, error) {
	// Checked assertion: return an error instead of panicking on an unexpected type
	resource, ok := obj.(*myservice.MyResource)
	if !ok {
		return extensions.PreReconcileCheckResult{},
			fmt.Errorf("unexpected resource type %T", obj)
	}

	// Complex validation that can't be done in webhook
	if resource.Spec.AdvancedConfig != nil {
		if err := ex.validateAdvancedConfig(resource.Spec.AdvancedConfig); err != nil {
			// Configuration is invalid: return an error that impacts the Ready
			// condition, since waiting will not fix it
			return extensions.PreReconcileCheckResult{}, conditions.NewReadyConditionImpactingError(
				err,
				conditions.ConditionSeverityError,
				conditions.ReasonFailed)
		}
	}

	// Configuration valid, call next checker in chain
	return next(ctx, obj, owner, resourceResolver, armClient, log)
}
Pattern 4: Check Azure Resource State
func (ex *ResourceExtension) PreReconcileCheck(
	ctx context.Context,
	obj genruntime.MetaObject,
	owner genruntime.MetaObject,
	resourceResolver *resolver.Resolver,
	armClient *genericarmclient.GenericClient,
	log logr.Logger,
	next extensions.PreReconcileCheckFunc,
) (extensions.PreReconcileCheckResult, error) {
	// Checked assertion: return an error instead of panicking on an unexpected type
	resource, ok := obj.(*myservice.MyResource)
	if !ok {
		return extensions.PreReconcileCheckResult{},
			fmt.Errorf("unexpected resource type %T", obj)
	}

	// Get resource ID if it exists
	resourceID, hasID := genruntime.GetResourceID(resource)
	if !hasID {
		// Not claimed yet, nothing to query; defer to the next checker
		return next(ctx, obj, owner, resourceResolver, armClient, log)
	}

	// Query current state from Azure
	var azureState MyResourceState
	apiVersion := "2023-01-01"
	_, err := armClient.GetByID(ctx, resourceID, apiVersion, &azureState)
	if err != nil {
		// The check itself failed; wrap the error so the cause is visible
		return extensions.PreReconcileCheckResult{},
			fmt.Errorf("getting current state of %s: %w", resourceID, err)
	}

	// Check if resource is in a state that allows updates
	if azureState.Status == "Locked" || azureState.Status == "Deleting" {
		return extensions.BlockReconcile(
			fmt.Sprintf("resource in %s state, cannot reconcile", azureState.Status)), nil
	}

	// State allows updates, call next checker in chain
	return next(ctx, obj, owner, resourceResolver, armClient, log)
}
Check Result
The extension returns one of two results, or an error.
Proceed
return extensions.ProceedWithReconcile(), nil
- Prerequisites are satisfied
- Reconciliation will continue normally
- ARM operations will be attempted
Block
return extensions.BlockReconcile("reason for blocking"), nil
- Prerequisites not met
- Reconciliation skipped for now
- Resource will be requeued to try again later
- Condition set with the blocking reason
Error
return extensions.PreReconcileCheckResult{}, fmt.Errorf("check failed: %w", err)
- The check itself failed
- Error condition set on resource
- Reconciliation blocked until error resolved
Block vs. Error
Understanding the difference is important:
- Block: “Not ready to reconcile yet, try again later” (transient)
  - Example: “waiting for owner”, “resource locked”
  - Returns BlockReconcile(reason), nil
  - Resource requeued automatically
- Error: “Something went wrong during the check” (needs attention)
  - Example: “invalid configuration”, “failed to query Azure”
  - Returns PreReconcileCheckResult{}, error
  - May be permanent issue requiring user action
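The distinction can be sketched with local stand-in types. Here `checkResult`, `proceed`, and `block` only mimic the shape of PreReconcileCheckResult and its factory functions, and `checkOwner` and `errLookupFailed` are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// checkResult is a local stand-in for extensions.PreReconcileCheckResult.
type checkResult struct {
	blocked bool
	reason  string
}

func proceed() checkResult            { return checkResult{} }
func block(reason string) checkResult { return checkResult{blocked: true, reason: reason} }

// errLookupFailed simulates a failure of the check itself, such as a failed query.
var errLookupFailed = errors.New("lookup failed")

// checkOwner is a hypothetical check. ownerReady reports the owner's state;
// lookupErr is non-nil when that state could not be determined.
func checkOwner(ownerReady bool, lookupErr error) (checkResult, error) {
	if lookupErr != nil {
		// The check could not run: surface an error, not a block.
		return checkResult{}, fmt.Errorf("check failed: %w", lookupErr)
	}
	if !ownerReady {
		// The check ran and the answer is "not yet": block without an error.
		return block("waiting for owner"), nil
	}
	return proceed(), nil
}
```

Callers of a function shaped like this should inspect the error first, because the result that accompanies an error carries no meaning.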
Reconciliation Impact
When a pre-reconciliation check blocks:
- No ARM operations: No requests sent to Azure
- Condition set: Condition added explaining why blocked
- Reconciliation requeued: Will try again automatically
- Resource state preserved: No changes made to resource
- Efficient waiting: Avoids unnecessary ARM calls
This continues until the check passes or the resource is deleted.
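As a rough illustration of that loop, the toy model below counts simulated ARM calls while a check keeps blocking. It is not ASO's reconciler; `fakeReconciler`, `reconcileOnce`, and `runUntilDone` are hypothetical names, and the check is reduced to a function that returns a blocking reason or an empty string.

```go
package main

// fakeReconciler is a toy model of the behaviour described above.
type fakeReconciler struct {
	armCalls  int    // number of simulated ARM requests
	condition string // last blocking reason recorded, "" when none
}

// reconcileOnce runs the check and reports whether a requeue is needed.
// check returns a blocking reason, or "" when prerequisites are met.
func (r *fakeReconciler) reconcileOnce(check func() string) (requeue bool) {
	if reason := check(); reason != "" {
		r.condition = reason // record why we are waiting
		return true          // try again later; no ARM call was made
	}
	r.condition = ""
	r.armCalls++ // prerequisites met: the ARM operation would happen here
	return false
}

// runUntilDone retries up to maxAttempts times and returns the attempts used.
func (r *fakeReconciler) runUntilDone(check func() string, maxAttempts int) int {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if !r.reconcileOnce(check) {
			return attempt
		}
	}
	return maxAttempts
}
```

With a check that blocks twice and then passes, this model makes three attempts but only one simulated ARM call, which is the saving that blocking is meant to provide.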
Testing
When testing PreReconciliationChecker extensions:
- Test proceed case: Verify check passes when prerequisites met
- Test block cases: Cover all blocking scenarios
- Test error handling: Verify proper error handling
- Test with nil owner: Handle cases with no owner
- Test check chain: Verify calling next() works correctly
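A table-driven layout covers the proceed, block, and nil-owner cases in one place. The sketch below is self-contained and uses a hypothetical `ownerCheck` function in place of a real extension; it treats a nil owner as "nothing to wait for", which matches the simplified example earlier but is a choice each resource has to make.

```go
package main

import "fmt"

// ownerCheck is a hypothetical stand-in for the logic under test.
// ready == nil models a nil owner. It returns a blocking reason, or "" to proceed.
func ownerCheck(ready *bool) string {
	switch {
	case ready == nil:
		return "" // no owner (root resource): nothing to wait for
	case !*ready:
		return "waiting for owner to be ready"
	default:
		return ""
	}
}

// ownerCheckFailures runs the table and returns one message per failing case.
func ownerCheckFailures() []string {
	yes, no := true, false
	cases := []struct {
		name      string
		ready     *bool
		wantBlock bool
	}{
		{"proceeds when owner ready", &yes, false},
		{"blocks when owner not ready", &no, true},
		{"proceeds with nil owner", nil, false},
	}

	var failures []string
	for _, c := range cases {
		if got := ownerCheck(c.ready) != ""; got != c.wantBlock {
			failures = append(failures,
				fmt.Sprintf("%s: blocked=%v, want %v", c.name, got, c.wantBlock))
		}
	}
	return failures
}
```

In a `_test.go` file the same table would typically be iterated inside a `func TestXxx(t *testing.T)`, reporting each mismatch with `t.Errorf`.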
Performance Considerations
Pre-reconciliation checks run on every reconciliation attempt, so:
- Keep them fast: Avoid expensive operations when possible
- Cache results: If appropriate, cache validation results
- Minimize ARM calls: Use status/cache over live queries
- Fail fast: Return quickly when blocking
- Be selective: Only check what’s necessary
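One way to cache results is to remember the outcome of an expensive check for a short time. The sketch below is an assumption about how that might look, not something ASO provides; `cachedCheck`, `fakeClock`, and `defaultTTL` are hypothetical names, and the 30-second value is arbitrary.

```go
package main

import (
	"sync"
	"time"
)

// defaultTTL is an arbitrary illustrative lifetime for a cached result.
const defaultTTL = 30 * time.Second

// cachedCheck memoizes the result of an expensive check for a fixed TTL.
type cachedCheck struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock; time.Now outside of tests
	check   func() string    // returns a blocking reason, or "" to proceed
	reason  string
	expires time.Time
}

func newCachedCheck(ttl time.Duration, now func() time.Time, check func() string) *cachedCheck {
	return &cachedCheck{ttl: ttl, now: now, check: check}
}

// Reason returns the cached result while it is fresh and re-runs the check otherwise.
func (c *cachedCheck) Reason() string {
	c.mu.Lock()
	defer c.mu.Unlock()

	if c.now().Before(c.expires) {
		return c.reason
	}
	c.reason = c.check()
	c.expires = c.now().Add(c.ttl)
	return c.reason
}

// fakeClock is a manually advanced clock for exercising the cache.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time          { return f.t }
func (f *fakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }
```

The trade-off is staleness: a cached "blocked" result can delay reconciliation by up to one TTL after the prerequisite becomes ready, so short lifetimes are the safer default.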
Common Use Cases
- Owner/Parent Readiness: Most common - wait for parent to be ready
- Reference Resolution: Ensure referenced resources exist and are ready
- Ordering Dependencies: Ensure resources created in correct order
- State Validation: Verify resource in appropriate state for updates
- Quota/Limit Checks: Prevent operations that would exceed limits
- External System Dependencies: Wait for external systems to be ready
Important Notes
- Call next() when checks pass: Allows for check chaining
- Don’t modify the resource: This is for validation only
- Provide clear reasons: Blocking messages are shown to users
- Be idempotent: Checks may run many times
- Use factory methods: Always use the factory methods for PreReconcileCheckResult to ensure consistency
- Handle nil owner: Owner can be nil for root resources
- Use conditions package: For setting appropriate conditions
- Log decisions: Help debugging by explaining why checks block
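To show why calling next() matters, the sketch below composes several checks so that each one runs only if the previous ones passed. It uses simplified stand-in types: `checkFunc` plays the role of the next function, while `checker`, `chain`, and `blockIf` are hypothetical helpers. How ASO itself assembles the chain is not shown here.

```go
package main

// checkFunc is a simplified stand-in for the next function handed to a check.
// It returns a blocking reason, or "" to proceed.
type checkFunc func(resource string) (string, error)

// checker is a simplified stand-in for an extension: it receives the
// next check in the chain and decides whether to call it.
type checker func(resource string, next checkFunc) (string, error)

// chain composes checkers so that each runs only if the previous ones passed.
func chain(checkers ...checker) checkFunc {
	// The end of the chain always proceeds.
	var tail checkFunc = func(string) (string, error) { return "", nil }
	// Wrap from last to first so that checkers[0] runs first.
	for i := len(checkers) - 1; i >= 0; i-- {
		c, next := checkers[i], tail
		tail = func(resource string) (string, error) { return c(resource, next) }
	}
	return tail
}

// blockIf returns a checker that blocks with reason when cond is true
// and defers to next otherwise.
func blockIf(cond func(string) bool, reason string) checker {
	return func(resource string, next checkFunc) (string, error) {
		if cond(resource) {
			return reason, nil
		}
		return next(resource)
	}
}
```

A checker that returned a proceed result directly instead of calling next would silently skip every check after it, which is the failure mode the first note above guards against.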
Related Extension Points
- PreReconciliationOwnerChecker: Specialized owner checks
- PostReconciliationChecker: Check after reconciliation
Best Practices
- Validate early: Catch issues before ARM operations
- Clear messaging: Users need to understand why reconciliation is blocked
- Appropriate blocking: Only block when reconciliation would fail
- Consider performance: These run frequently
- Use conditions: Set appropriate conditions when blocking
- Handle edge cases: Nil owners, missing references, etc.
- Test thoroughly: Cover all blocking scenarios