Below is a list of commands you can use to view relevant logs of aad-pod-identity components.
You can use grep ^E together with kubectl's --since flag to isolate errors that occurred within a given duration.
kubectl logs -l component=mic --since=1h | grep ^E
kubectl logs -l component=nmi --since=1h | grep ^E
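When gathering logs for an issue, it can be handy to save the error lines from both components to files. The following is a minimal sketch, assuming bash and klog-style log lines (a severity letter followed by the month and day as four digits). The function names klog_errors and collect_errors and the output file names are illustrative and not part of aad-pod-identity. Because kubectl logs returns only the last 10 lines per pod when a label selector is used, the sketch passes --tail=-1 to read everything within the time window.

```shell
#!/usr/bin/env bash

# Print only klog error lines (an "E" followed by MMDD and a space) from stdin.
klog_errors() {
  grep -E '^E[0-9]{4} ' || true   # finding no error lines is not a failure
}

# Save the last hour of error lines for MIC and NMI to <dir>/<component>-errors.log.
# Hypothetical helper; assumes kubectl already points at the right cluster and namespace.
collect_errors() {
  local dir="${1:-.}" component
  for component in mic nmi; do
    kubectl logs -l "component=${component}" --since=1h --tail=-1 \
      | klog_errors > "${dir}/${component}-errors.log"
  done
}
```

Calling collect_errors /tmp would write /tmp/mic-errors.log and /tmp/nmi-errors.log, which can then be attached to the issue.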
It is always a good idea to include relevant logs from MIC and NMI when opening a new issue.
To verify that the NMI pods have injected the correct iptables rule into each node, run the following commands. They check that, on a given node, there is an iptables rule that routes all packets with a destination IP of 169.254.169.254 (the IMDS endpoint) to port 2579 of the host network.
NMI_POD=$(kubectl get pod -l component=nmi -ojsonpath='{.items[?(@.spec.nodeName=="<NodeName>")].metadata.name}')
kubectl exec $NMI_POD -- iptables -t nat -S aad-metadata
The expected output should be:
-N aad-metadata
-A aad-metadata ! -s 127.0.0.1/32 -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 10.240.0.34:2579
-A aad-metadata -j RETURN
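The comparison against the expected output can also be scripted. This is a minimal sketch, assuming bash; has_nmi_dnat_rule is a hypothetical helper name. It only checks that some rule in the chain DNATs IMDS traffic on port 80 to port 2579, whatever the node IP happens to be.

```shell
#!/usr/bin/env bash

# Read the output of `iptables -t nat -S aad-metadata` on stdin and succeed
# only if it contains a rule that DNATs 169.254.169.254:80 to <node IP>:2579.
has_nmi_dnat_rule() {
  # "--" stops option parsing, because the pattern itself starts with a dash.
  grep -Eq -- '-d 169\.254\.169\.254/32 .*--dport 80 -j DNAT --to-destination [0-9.]+:2579$'
}
```

For example, with NMI_POD set as above: kubectl exec "$NMI_POD" -- iptables -t nat -S aad-metadata | has_nmi_dnat_rule && echo "rule present"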
You can run the following commands to validate your identity setup (assuming you have the proper AzureIdentity and AzureIdentityBinding deployed):
kubectl run azure-cli -it --image=mcr.microsoft.com/azure-cli --labels=aadpodidbinding=<selector defined in AzureIdentityBinding> /bin/bash
# within the azure-cli shell
az login --identity --allow-no-subscriptions --debug
az login --identity uses the Azure identity bound to the azure-cli pod to log in to Azure via Azure CLI. If it succeeds, the output looks like the following:
urllib3.connectionpool : Starting new HTTP connection (1): 169.254.169.254:80
urllib3.connectionpool : http://169.254.169.254:80 "GET /metadata/identity/oauth2/token?resource=https%3A%2F%2Fmanagement.core.windows.net%2F&api-version=2018-02-01 HTTP/1.1" 200 1667
msrestazure.azure_active_directory : MSI: Retrieving a token from http://169.254.169.254/metadata/identity/oauth2/token, with payload {'resource': 'https://management.core.windows.net/', 'api-version': '2018-02-01'}
msrestazure.azure_active_directory : MSI: Token retrieved
...
[
  {
    "environmentName": "AzureCloud",
    "homeTenantId": "<REDACTED>",
    "id": "<REDACTED>",
    "isDefault": true,
    "managedByTenants": [],
    "name": "<REDACTED>",
    "state": "Enabled",
    "tenantId": "<REDACTED>",
    "user": {
      "assignedIdentityInfo": "MSI",
      "name": "systemAssignedIdentity",
      "type": "servicePrincipal"
    }
  }
]
Based on the logs above, Azure CLI was able to retrieve a token from http://169.254.169.254:80/metadata/identity/oauth2/token. Its request is routed to the NMI pod running on the same node. Identify which node the Azure CLI pod is scheduled on by running the following command:
kubectl get pods -owide
NAME        READY   STATUS    RESTARTS   AGE   IP             NODE                                 NOMINATED NODE   READINESS GATES
azure-cli   1/1     Running   1          12s   10.240.0.117   k8s-agentpool1-95854893-vmss000002   <none>           <none>
Take note of the node the pod is scheduled on and the pod's IP address. Check the logs of the NMI pod that is scheduled on the same node. You should see a token request from the azure-cli pod, identified by its pod IP address 10.240.0.117:
kubectl logs <nmi pod name>
...
I0821 18:22:50.810806 1 standard.go:72] no clientID or resourceID in request. default/azure-cli has been matched with azure identity default/demo
I0821 18:22:50.810895 1 standard.go:178] matched identityType:0 clientid:7eb6##### REDACTED #####a6a9 resource:https://management.core.windows.net/
I0821 18:22:51.348117 1 server.go:190] status (200) took 537597287 ns for req.method=GET reg.path=/metadata/identity/oauth2/token req.remote=10.240.0.117
...
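On a busy node, the NMI log contains requests from many pods. The request lines can be filtered by the req.remote field shown above. This is a minimal sketch, assuming bash and awk; nmi_requests_from is an illustrative name. Note that only the "status" lines carry req.remote; the "matched" lines identify the pod by namespace and name instead.

```shell
#!/usr/bin/env bash

# Print NMI log lines whose req.remote field is exactly the given pod IP.
# Usage: kubectl logs <nmi pod name> | nmi_requests_from 10.240.0.117
nmi_requests_from() {
  local ip="$1"
  # Compare whole whitespace-separated fields so that 10.240.0.11 does not
  # also match 10.240.0.117.
  awk -v want="req.remote=${ip}" \
    '{ for (i = 1; i <= NF; i++) if ($i == want) { print; next } }'
}
```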
Common issues or questions that users have run into when using pod identity are detailed below.
If you are using MIC v1.6.0+, ensure that the fields of AzureIdentity and AzureIdentityBinding are capitalized correctly. For more information, please refer to this section.
If you received the following error message in MIC:
Code="LinkedAuthorizationFailed" Message="The client '<ClientID>' with object id '<ObjectID>' has permission to perform action 'Microsoft.Compute/<VMType>/write' on scope '<VM/VMSS scope>'; however, it does not have permission to perform action 'Microsoft.ManagedIdentity/userAssignedIdentities/assign/action' on the linked scope(s) '<UserAssignedIdentityScope>' or the linked scope(s) are invalid."
It means that your cluster service principal / managed identity does not have the role assignment required to assign the chosen user-assigned identities to the VM/VMSS. For more information, please follow this documentation to allow your cluster service principal / managed identity to perform identity-related operations.
Unable to remove AzureAssignedIdentity after MIC pods are deleted
With release 1.6.1, finalizers have been added to AzureAssignedIdentity to ensure that MIC cleans up the identities successfully before they are deleted. However, if the MIC deployment is force deleted before it has finished cleaning up identities from the underlying node, the AzureAssignedIdentity will be left behind because it still contains a finalizer.
To delete all AzureAssignedIdentity resources, run the following commands:
kubectl get azureassignedidentity -A -o=json | jq '.items[].metadata.finalizers=null' | kubectl apply -f -
kubectl delete azureassignedidentity --all
To delete only a specific AzureAssignedIdentity, run the following commands:
kubectl get azureassignedidentity <name> -n <namespace> -o=json | jq '.metadata.finalizers=null' | kubectl apply -f -
kubectl delete azureassignedidentity <name> -n <namespace>
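An alternative that avoids jq is to clear the finalizers with kubectl patch before deleting. This is a sketch rather than a method taken from the aad-pod-identity documentation: force_delete_assigned_identity is a hypothetical helper, and the KUBECTL variable exists only so that the commands can be previewed with KUBECTL=echo before running them for real.

```shell
#!/usr/bin/env bash

# Clear the finalizers on one AzureAssignedIdentity, then delete it.
# Usage: force_delete_assigned_identity <name> <namespace>
force_delete_assigned_identity() {
  local name="$1" namespace="$2"
  local kubectl="${KUBECTL:-kubectl}"   # override with KUBECTL=echo for a dry run
  # A JSON merge patch that sets "finalizers" to null removes the field.
  "$kubectl" patch azureassignedidentity "$name" -n "$namespace" \
    --type=merge -p '{"metadata":{"finalizers":null}}' &&
  "$kubectl" delete azureassignedidentity "$name" -n "$namespace"
}
```

Running KUBECTL=echo force_delete_assigned_identity <name> <namespace> prints the two kubectl invocations without touching the cluster.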
If you received the following or similar error in your application:
azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/subId/resourceGroups/rg/providers/Microsoft.Network/dnsZones?api-version=2018-05-01: StatusCode=0 -- Original Error: adal: Failed to execute the refresh request. Error = 'Get \"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.core.windows.net%2F\": dial tcp 169.254.169.254:80: i/o timeout'
It means there is a network policy blocking egress traffic to 169.254.169.254 from the host. NMI pods run on hostNetwork and listen on 127.0.0.1:2579. Please ensure there is a network policy that allows traffic to 127.0.0.1:2579. Example GlobalNetworkPolicy configuration for Calico:
kind: GlobalNetworkPolicy
apiVersion: crd.projectcalico.org/v1
metadata:
  name: egress-localhost
spec:
  types:
  - Egress
  egress:
  - action: Allow
    protocol: TCP
    destination:
      nets:
      - 127.0.0.1
      ports: [2579]
Spark jobs that use AAD Pod Identity to acquire tokens should add the following configurations (assuming AzureIdentity and AzureIdentityBinding are deployed beforehand):
...
--conf spark.kubernetes.driver.label.aadpodidbinding=<AzureIdentityBinding selector> \
--conf spark.kubernetes.executor.label.aadpodidbinding=<AzureIdentityBinding selector> \
...