Module 5: Lab 1 - Container Insights

Here, in this lab, as part of Module 5: Operate and Monitor, we will look at Container Insights.

Before attempting this lab, please be sure to complete the items described in the Getting Started Section.

Container Insights

Container Insights is a feature designed to monitor the performance of container workloads deployed to the cloud. It gives you performance visibility by collecting memory and processor metrics from controllers, nodes, and containers that are available in Kubernetes through the Metrics API. After you enable monitoring from Kubernetes clusters, metrics and Container logs are automatically collected for you through a containerized version of the Log Analytics agent for Linux. Metrics are sent to the metrics database in Azure Monitor. Log data is sent to your Log Analytics workspace.

Enable Container Insights

Container Insights is designed to store its data in a Log Analytics workspace. You can let the enablement process create a Log Analytics workspace for this purpose, or if you already have a workspace, you can use that one. See Designing your Azure Monitor Logs deployment to learn more about best practices for Log Analytics.

Here, let’s begin by creating a Log Analytics workspace in order to support Container Insights. Right now, we will do this using the Azure CLI. Later, we will augment our Bicep templates in order to perform this same work.

az monitor log-analytics workspace create --resource-group $resourceGroupName --workspace-name $workspaceName
WORKSPACEID=$(az monitor log-analytics workspace show --resource-group $resourceGroupName --workspace-name $workspaceName --query id -o tsv)

Now, let’s augment our cluster and enable Container Insights.

az aks enable-addons -a monitoring --resource-group $resourceGroupName --name $clusterName --workspace-resource-id $WORKSPACEID

Let’s verify that the Container Insights agent and solution were successfully deployed. First, we’ll verify the daemonset was deployed:

kubectl get daemonset ama-logs --namespace=kube-system

The output should resemble the following:

NAME       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
ama-logs   3         3         3       3            3           <none>          45m

Next, we’ll verify that the deployment was created:

kubectl get deployment ama-logs-rs --namespace=kube-system

The output should resemble the following:

NAME          READY   UP-TO-DATE   AVAILABLE   AGE
ama-logs-rs   1/1     1            1           47m

Using Container Insights

Now that Container Insights has been enabled, we can turn our attention to the Azure Portal and see the results of our labor. (It may take a few minutes for data to flow into the Log Analytics workspace.)

Within the Portal, navigate to the cluster. Once inside the cluster, check out the Monitoring section of the menu system and open the Insights tab. Here, you will be presented with a nice visualization of your cluster, showing node count, CPU, and memory utilization. You’ll also a graph showing the active pod count. These views are dynamic. You can change the time range, or even look at live data from the cluster.

Container Insights Dashboard

Aside from the Cluster view within the Insights tab, you will find lists that desscribe your cluster’s nodes, controllers and containers. You will also find a tab dedicated to reports. These data driven reports provide additional insight into your cluster nodes, resource utilization, networking, and billing.

Now, let’s take a moment and test Container Insights by applying some load to our cluster.

First, let’s create a namespace to hold our work.

kubectl create namespace containerinsightstest
kubectl config set-context --current --namespace containerinsightstest

Next, let’s run an interactive bash Pod on the cluster:

kubectl run test-shell --rm -i --tty --image ubuntu -- bash

Now, within the test-shell Pod, update, install and run stress:

apt update
apt install stress
stress -c 10

The above commands will generate a sustained CPU spike in the cluster. Return to Container Insights and view the Cluster tab. Turn on Live updates and you should see the Node CPU Utilization graph jump as a result of the stress command.

Note: it may take several minutes in order for the visualization to update and show the increased utilization of your cluster.

Container Insights Dashboard

Next, change the view by clicking on the Nodes tab. Here, you will see a summary of what’s happening inside the cluster. Notice that one of your nodes (The one running stress) should be much more busy than the others.

Container Insights Nodes Tab

Find the node that appears to be the most busy in your cluster and expand its line item. Here, you will see a list of the processes running on that node. You should see our test-shell pod running stress at the top of this list.

Container Insights Node Details

Next, change the view by clicking on the Containers tab. Here, you will be presented with a list of containers running on the cluster. Notice that our test-shell pod is at the top of the list.

Container Insights Containers Tab

Select the test-shell container and you’ll get a description of the container.

Container Insights Container Overview

Here, you can also see a live stream of the container console and events.

Container Insights Container Events

Return to test-shell and type ctrl-c to terminate stress. Then, exit the pod.

exit

Now, let’s clean our cluster:

kubectl delete namespace containerinsightstest
kubectl config set-context --current --namespace default

Additional Diagnostics (Optional)

Container Insights provides excellent visibility within our Kubernetes Clusters. However, we can get even more visibility by streaming diagnostics data into the Azure Log Analytics workspace we just created. AKS offers you the ability to stream many types of diagnostic data, including log data from various sources as well as performance metrics.

Note: If you choose to implement Azure Sentinel as your centralized security monitoring solution, then this step will be done for you automatically when you add the Azure Kubernetes Service Data Connector to Sentinel. Sentinel will connect its associated Log Analytics Workspace to your cluster’s diagnostics.

Use the following CLI command to turn begin streaming select diagnostics data into Log Analytics

CLUSTERID=$(az aks show --resource-group $resourceGroupName --name $clusterName --query id -o tsv)
echo '['>diag.config
echo '{"category": "cluster-autoscaler", "enabled": true},'>>diag.config
echo '{"category": "guard", "enabled" :true},'>>diag.config
echo '{"category": "kube-apiserver", "enabled": true},'>>diag.config
echo '{"category": "kube-audit", "enabled": true},'>>diag.config
echo '{"category": "kube-audit-admin", "enabled": true},'>>diag.config
echo '{"category": "kube-controller-manager", "enabled": true},'>>diag.config
echo '{"category": "kube-scheduler", "enabled": true}'>>diag.config
echo ']'>>diag.config

az monitor diagnostic-settings create \
--name "diag01" \
--resource "$CLUSTERID" \
--workspace "$WORKSPACEID" \
--logs @diag.config

rm diag.config

Update Bicep Templates (Optional)

Now that we have enabled Container Insights, let’s go back and update our Bicep tempaltes in order to make sure our deployment process picks up the changes.

First, add the Log Analytics workspace to the template:

// Parameters...

@description('Log Analytics Workspace name')
param workspaceName string

// Log Analytics Workspace Definition 
resource workspace 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
  name: workspaceName
  location: location
}

// Cluster Definition...

Next, adjust the AKS cluster and enable Container Insights:

// Inside Cluster Definition; add the following to properties

addonProfiles: {
    omsAgent: {
        enabled: true
        config: {
            logAnalyticsWorkspaceResourceID: workspace.id
        }
    }

    // ...
}

Finally, add in Diagnostics at the end of the template:

resource diag01 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
    name: 'diag01'
    scope: aks
    properties: {
        logs: [{
            category: 'cluster-autoscaler'
            enabled: true
            retentionPolicy: {
                days: 0
                enabled: false
            }
        }, {
            category: 'guard'
            enabled: true
            retentionPolicy: {
                days: 0
                enabled: false
            }
        }, {
            category: 'kube-apiserver'
            enabled: true
            retentionPolicy: {
                days: 0
                enabled: false
            }
        },
        {
            category: 'kube-audit'
            enabled: true
            retentionPolicy: {
                days: 0
                enabled: false
            } 
        }, {
            category: 'kube-audit-admin'
            enabled: true
            retentionPolicy: {
                days: 0
                enabled: false
            }
        }, {
            category: 'kube-controller-manager'
            enabled: true
            retentionPolicy: {
                days: 0
                enabled: false
            }
        }, {
            category: 'kube-scheduler'
            enabled: true
            retentionPolicy: {
                days: 0
                enabled: false
            }
        }]
        workspaceId: workspace.id
    }
}

Conclusion

This completes Lab 1 - Container Insights. If you would like, you may continue by completing Lab 2 - Azure Policy for Kubernetes, Lab 3 - Defender for Containers, or return to the Introduction.