2022-09: Reading Status properties from other ASO resources

Context

It isn’t currently possible to create a UserAssignedIdentity and use it in the same YAML, because using it requires knowing its PrincipalId, which Azure generates only when the identity is created. This is discussed in a variety of issues; see #2435 and #2350 for two examples. It is also somewhat related to #2474.

The following example highlights the problem:

apiVersion: managedidentity.azure.com/v1beta20181130
kind: UserAssignedIdentity
metadata:
  name: sample-uai
spec:
  location: Germany West Central
  owner:
    name: dev-sample-rg
---
apiVersion: authorization.azure.com/v1beta20200801preview
kind: RoleAssignment
metadata:
  name: 6a2d44f5-57d8-4916-9f46-ff7c9c1b338f
spec:
  location: Germany West Central
  owner:
    name: samplevnet
    group: network.azure.com
    kind: VirtualNetwork
  principalId: <NOT YET KNOWN>  # Problem!!
  roleDefinitionReference:
    armId: /subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c

This results in an awkward-to-use API and prevents ASO from fulfilling its main duty: managing infrastructure as code and deploying things simply, with no manual steps.

It’s worth noting that the general problem of reading a value from another resource to use in this resource comes up in more than just the AAD context.

Options

In all of the proposed solutions, an annotation would need to be added into azure-arm.yaml to indicate the property in question is special:

RoleAssignmentProperties:
  PrincipalId:
    $aadReference: true

or for the more generic proposals:

RoleAssignmentProperties:
  PrincipalId:
    $arbitraryReference: true

Option 1: Support a new type similar to ResourceReference, but specifically for AAD references

That would look something like:

// +kubebuilder:object:generate=true
type IdentityReference struct {

    // Name of the identity in Kubernetes
    Name string `json:"name,omitempty"`

    // PrincipalId of the identity in AAD. This is mutually exclusive with Name.
    PrincipalId string `json:"principalId,omitempty"`
}
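
The comment on PrincipalId says the two fields are mutually exclusive, but the struct alone cannot enforce that. A minimal sketch of how the rule might be checked is shown here; the `Validate` method and its error messages are hypothetical and not part of the proposal, and in practice the check would more likely live in an admission webhook.

```go
package main

import (
	"errors"
	"fmt"
)

// IdentityReference mirrors the struct proposed for Option 1.
type IdentityReference struct {
	// Name of the identity in Kubernetes
	Name string `json:"name,omitempty"`

	// PrincipalId of the identity in AAD. This is mutually exclusive with Name.
	PrincipalId string `json:"principalId,omitempty"`
}

// Validate is a hypothetical helper that enforces "exactly one of
// Name or PrincipalId is set".
func (r IdentityReference) Validate() error {
	switch {
	case r.Name != "" && r.PrincipalId != "":
		return errors.New("name and principalId are mutually exclusive; set only one")
	case r.Name == "" && r.PrincipalId == "":
		return errors.New("one of name or principalId must be set")
	}
	return nil
}

func main() {
	refs := []IdentityReference{
		{Name: "sample-uai"},
		{Name: "sample-uai", PrincipalId: "00000000-0000-0000-0000-000000000000"},
		{},
	}
	for _, ref := range refs {
		fmt.Printf("%+v -> %v\n", ref, ref.Validate())
	}
}
```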

Usage would be something like:

apiVersion: managedidentity.azure.com/v1beta20181130
kind: UserAssignedIdentity
metadata:
  name: sample-uai
spec:
  location: Germany West Central
  owner:
    name: dev-sample-rg
---
apiVersion: authorization.azure.com/v1beta20200801preview
kind: RoleAssignment
metadata:
  name: 6a2d44f5-57d8-4916-9f46-ff7c9c1b338f
spec:
  location: Germany West Central
  owner:
    name: samplevnet
    group: network.azure.com
    kind: VirtualNetwork
  principalId:
    name: sample-uai  # References the identity from above by name
  roleDefinitionReference:
    armId: /subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c

Pros:

  1. Relatively easy to implement
  2. Simple to use
  3. Can give a good error message if the identity in question doesn’t exist

Cons:

  1. Not generic. What if some resources require ClientId instead of PrincipalId? We would need a new type for that.

Option 2: Support a generic definition for reading from the Status of other ASO resources

// +kubebuilder:object:generate=true
type StringReference struct {

    Value string `json:"value,omitempty"`

    Ref *StringReferenceRef `json:"ref,omitempty"`
}

// +kubebuilder:object:generate=true
type StringReferenceRef struct {
    KubernetesResourceReference

    Path string `json:"path,omitempty"`
}

type KubernetesResourceReference struct {
    // Group is the Kubernetes group of the resource.
    Group string `json:"group,omitempty"`
    // Kind is the Kubernetes kind of the resource.
    Kind string `json:"kind,omitempty"`
    // Name is the Kubernetes name of the resource.
    Name string `json:"name,omitempty"`
}

Usage would be something like:

apiVersion: managedidentity.azure.com/v1beta20181130
kind: UserAssignedIdentity
metadata:
  name: sample-uai
spec:
  location: Germany West Central
  owner:
    name: dev-sample-rg
---
apiVersion: authorization.azure.com/v1beta20200801preview
kind: RoleAssignment
metadata:
  name: 6a2d44f5-57d8-4916-9f46-ff7c9c1b338f
spec:
  location: Germany West Central
  owner:
    name: samplevnet
    group: network.azure.com
    kind: VirtualNetwork
  principalId:
    ref:
      name: sample-uai  # Can pull any arbitrary property from another ASO resource.
      group: managedidentity.azure.com
      kind: UserAssignedIdentity
      path: status.principalId  # In this case, getting status.principalId
  roleDefinitionReference:
    armId: /subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c

Pros:

  1. More powerful. Could be reused in the future if/when more of this sort of requirement comes up.

Cons:

  1. More verbose for users. At least in this specific example it’s a lot more writing (and more room for mistakes) than the shorter alternatives.
  2. Errors are likely to be harder to decipher. Without something in status that says what the reference actually resolved to, it might be hard to figure out exactly what was sent to Azure, unless Azure specifically calls it out in the error response that shows up in the Ready Condition (there’s no guarantee that all RPs do this). We could mitigate this by adding a section to status that records what the reference resolved to, although doing so will complicate the reconcile pass. Alternatively, we could punt on that and see how difficult users find it. Most Azure services probably give details about what value was sent, and we could error on an empty string to protect against typos.
  3. Actually evaluating the path is tricky. We would either need to write custom code to do it or use an existing JMESPath library. The latter would require regularly serializing objects to JSON, which could get expensive from a performance perspective.
  4. Uncertain general reconcile behavior. This reference constitutes a dependency on the referenced resource. Does this mean that ideally, the resource holding the reference will be Watch-ing the referenced resource and immediately react to changes in its dependencies? Implementing that dynamic watch behavior given the current structure of controller-runtime would be very difficult. In the worst case it would mean every ASO resource having a watch on every other resource.
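
To give a feel for the "custom code" route mentioned in the cons above, here is a minimal sketch of evaluating a dotted path such as `status.principalId`. It assumes the referenced resource is already available as nested `map[string]interface{}` values (the shape produced by decoding JSON into an untyped map), it only supports plain field names (no array indexing or filters), and the `lookupString` helper is hypothetical. It also rejects empty strings, in line with the typo protection suggested above.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupString walks a dot-separated path through nested maps and returns
// the string found at the end of it. Hypothetical helper for illustration.
func lookupString(obj map[string]interface{}, path string) (string, error) {
	if path == "" {
		return "", fmt.Errorf("path must not be empty")
	}
	segments := strings.Split(path, ".")
	var current interface{} = obj
	for i, seg := range segments {
		m, ok := current.(map[string]interface{})
		if !ok {
			return "", fmt.Errorf("%q is not an object", strings.Join(segments[:i], "."))
		}
		next, found := m[seg]
		if !found {
			return "", fmt.Errorf("field %q not found", strings.Join(segments[:i+1], "."))
		}
		current = next
	}
	s, ok := current.(string)
	if !ok {
		return "", fmt.Errorf("value at %q is not a string", path)
	}
	if s == "" {
		// Treat an empty value as an error so typos and not-yet-populated
		// status fields are not silently sent to Azure.
		return "", fmt.Errorf("value at %q is empty", path)
	}
	return s, nil
}

func main() {
	// Stand-in for a UserAssignedIdentity whose status has been populated.
	identity := map[string]interface{}{
		"status": map[string]interface{}{
			"principalId": "11111111-1111-1111-1111-111111111111",
		},
	}
	for _, path := range []string{"status.principalId", "status.clientId"} {
		value, err := lookupString(identity, path)
		fmt.Printf("path=%s value=%q err=%v\n", path, value, err)
	}
}
```

Even this small version shows where the error-reporting concern comes from: a misspelled path only surfaces at reconcile time, not when the YAML is applied.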

Security:

  1. Risk of privilege escalation, where a user in the namespace who doesn’t have permission to read resources of type X instead asks ASO to read that resource. Azure then reflects that value back to them in the error message.

Option 3: Support reading/writing to a ConfigMap or Secret, and indirect through that

The idea here is that UserAssignedIdentity would allow exporting PrincipalId to a ConfigMap value, and the RoleAssignment resource would then support (optionally) reading from that store instead of having a hardcoded value. The main difference between this and what we have now is that, for references, we can’t replace the entire property with a SecretReference; we need a new type such as OptionalConfigReference.

We would only opt a select few resource properties into this OptionalConfigReference structure. We could (via serialization magic) support new properties adopting this shape in a non-breaking way, in case there are properties we want to promote to having this behavior after the resource is released.

// TODO: This type will support custom JSON serialization and deserialization to accept plain strings or a more 
// TODO: complicated structure (for when using as a reference). Type may change in shape in actual implementation,
// TODO: this is just an example
type OptionalConfigReference struct {
    Ref *ConfigReference `json:"ref,omitempty"`

    Value *string `json:"value,omitempty"`
}

Usage would be something like:

apiVersion: managedidentity.azure.com/v1beta20181130
kind: UserAssignedIdentity
metadata:
  name: sample-uai
spec:
  location: Germany West Central
  owner:
    name: dev-sample-rg
  operatorSpec:  
    configs:  # Export the value as a configmap entry
      principalId:
        name: uai-config
        key: principalid
---
apiVersion: authorization.azure.com/v1beta20200801preview
kind: RoleAssignment
metadata:
  name: 6a2d44f5-57d8-4916-9f46-ff7c9c1b338f
spec:
  location: Germany West Central
  owner:
    name: samplevnet
    group: network.azure.com
    kind: VirtualNetwork
  principalId:
    configMap: uai-config  # Reference the config map value exported above
    key: principalid
  roleDefinitionReference:
    armId: /subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c

Pros:

  1. Generic
  2. Would solve related problems where customers want access to these variables in their pods as well. See:
    1. #2349 - Access serviceBusEndpoint in ServiceBus namespace.
    2. #2350 - Export configuration from status.
    3. #2317 - Export clientId from UserAssignedIdentity.
  3. Solves the privilege escalation risk of Option 2.
  4. Resources only need to watch ConfigMap, as opposed to “all other ASO resources” as in Option 2.

Cons:

  1. More complicated to implement than Option 1, but less so than Option 2. The operator would need to make sure to watch the ConfigMap for changes and trigger reconciliation based on that.

Implementation

For dynamic serialization, see the go-task example, which does this.

Decision

We will go with Option 3, using ConfigMaps and OptionalConfigReference. This is the most flexible option and solves the most user requests while also limiting downsides.

Status

Agreed. Implementation pending.

Consequences

TBC.

Experience Report

TBC.

References

TBC.