2 - Vanilla Installation
Deploy Kubeflow into an AKS cluster using default settings.
Background
In this lab, you will use the Azure CLI to deploy an Azure Kubernetes Service (AKS) Automatic cluster. AKS Automatic offers a simplified, managed Kubernetes experience with automated node management, scaling, and security configurations. For more details, see the AKS Automatic documentation. Note that AKS Automatic is currently in preview. While it provides faster setup and less manual configuration, it is not recommended for production use. For production workloads, or when advanced features and customization are required, use regular AKS instead.
You will then install Kubeflow with its default settings using Kustomize and create a Jupyter notebook server that you can access from your browser.
You can follow these same instructions to deploy Kubeflow on a non-automatic AKS cluster.
Instructions for Basic Deployment without TLS and with Default Password
This deployment option is for testing only. To deploy with TLS and change the default password, see Deploy Kubeflow with TLS.
⚠️ Warning: This deployment option requires users to have access to the Kubernetes cluster. For a deployment option that does not have this restriction and that uses TLS and Ingress, see the [Deploy Kubeflow with TLS] option.
Deploy AKS Automatic
Use the Azure CLI to deploy an AKS Automatic cluster.
💡Note: To complete this deployment, you need the following permissions on the resource group:
- Microsoft.Authorization/policyAssignments/write
- Microsoft.Authorization/policyAssignments/read
For detailed instructions on installing AKS Automatic, please refer to the AKS Automatic installation documentation.
Log in to the Azure CLI with az login.
💡Note: If you have access to multiple subscriptions, you may need to run the following command to work with the appropriate subscription: az account set --subscription <NAME_OR_ID_OF_SUBSCRIPTION>
Set up your environment variables
RGNAME=kubeflow
CLUSTERNAME=kubeflow-aks-automatic
LOCATION=eastus
Create the resource group
az group create -n $RGNAME -l $LOCATION
Add or Update AKS extension
az extension add --name aks-preview
This article requires the aks-preview Azure CLI extension version 9.0.0b4 or later.
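The heading mentions both adding and updating the extension, while the command above only adds it. The following is a minimal sketch of a helper that covers both cases and prints the installed version, so you can compare it with the minimum version stated above. The function name ensure_aks_preview is hypothetical. The sketch assumes the --upgrade flag of az extension add behaves as described in the Azure CLI help: update if installed, install otherwise.

```shell
# Sketch: install aks-preview if missing, upgrade it if present,
# then print the installed version for a manual comparison.
# ensure_aks_preview is a hypothetical helper name, not an Azure CLI command.
ensure_aks_preview() {
  az extension add --name aks-preview --upgrade
  az extension show --name aks-preview --query version --output tsv
}

# Usage (commented out so that sourcing this snippet has no side effects):
# ensure_aks_preview
```

The version comparison is left to the reader because pre-release suffixes such as b4 do not sort reliably with generic version-sorting tools.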
Create an AKS Automatic cluster
az aks create \
--resource-group $RGNAME \
--name $CLUSTERNAME \
--location $LOCATION \
--sku automatic \
--generate-ssh-keys
💡Note: AKS Automatic is in preview and requires the feature to be registered in your subscription. If it is not registered yet, run the following command before creating the cluster:
az feature register --namespace Microsoft.ContainerService --name AutomaticSKUPreview
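Feature registration is asynchronous, so the cluster creation can fail if it runs before the registration completes. The following is a minimal sketch that polls the feature state until it reports Registered. The function name wait_for_feature and the attempt and delay defaults are assumptions for illustration only.

```shell
# Sketch: poll a preview feature until its state is "Registered".
# wait_for_feature is a hypothetical helper; attempts and delay are illustrative.
wait_for_feature() {
  local namespace="$1" feature="$2" attempts="${3:-30}" delay="${4:-20}" state=""
  for _ in $(seq 1 "$attempts"); do
    state=$(az feature show --namespace "$namespace" --name "$feature" \
      --query properties.state --output tsv)
    echo "Feature $feature state: $state"
    if [ "$state" = "Registered" ]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Usage (commented out so that sourcing this snippet has no side effects):
# wait_for_feature Microsoft.ContainerService AutomaticSKUPreview &&
#   az provider register --namespace Microsoft.ContainerService
```

Re-registering the resource provider after the feature shows as Registered is the usual way to propagate the change to the subscription.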
Connect to AKS Automatic Cluster
After the cluster is created, you can connect to it using the Azure CLI. The following command retrieves the credentials for your AKS cluster and configures kubectl to use them.
az aks get-credentials --resource-group $RGNAME --name $CLUSTERNAME
Verify connectivity to the cluster with kubectl get nodes. This should return a list of nodes.
💡Note: With AKS Automatic, you don’t need kubelogin as the cluster uses managed identity for authentication.
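If you want to script the connectivity check, the following sketch counts the nodes that report a Ready status. The function name ready_node_count is hypothetical, and the sketch assumes the default column layout of kubectl get nodes (NAME, STATUS, ROLES, AGE, VERSION).

```shell
# Sketch: count nodes whose STATUS column is exactly "Ready".
# Nodes reported as "Ready,SchedulingDisabled" are deliberately not counted.
# ready_node_count is a hypothetical helper name.
ready_node_count() {
  kubectl get nodes --no-headers | awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage (commented out so that sourcing this snippet has no side effects):
# echo "Ready nodes: $(ready_node_count)"
```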
Deploy KubeFlow
Clone this repo, which includes the kubeflow/manifests repo as a Git submodule
git clone --recurse-submodules https://github.com/Azure/kubeflow-aks.git
💡Note: The --recurse-submodules flag fetches the manifests from the Git submodule linked to this repo
Change directory into the newly cloned directory (kubeflow-aks)
Run Kubeflow Kustomize deployment
From the root of the repo, cd into Kubeflow’s manifests directory and make sure you are on the v1.10-branch.
cd manifests/
git checkout v1.10-branch
cd ..
Install all of the components via a single command
cp -a deployments/vanilla manifests/vanilla
cd manifests/
while ! kustomize build vanilla | kubectl apply --server-side=true -f -; do echo "Retrying to apply resources"; sleep 10; done
💡Note: The --server-side=true flag helps with large CRDs that may exceed annotation size limits. The retry loop handles dependency ordering issues during installation.
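The loop above retries forever, which can hide a persistent failure such as a missing kustomize binary. The following is a minimal sketch of a bounded variant. The helper names retry and apply_vanilla, and the attempt count and delay in the usage line, are assumptions for illustration.

```shell
# Sketch: run a command until it succeeds, giving up after a maximum number of attempts.
# retry and apply_vanilla are hypothetical helper names.
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Retrying to apply resources (attempt $attempt of $max_attempts)"
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Same pipeline as the command above, wrapped so that retry can call it.
apply_vanilla() {
  kustomize build vanilla | kubectl apply --server-side=true -f -
}

# Usage (commented out so that sourcing this snippet has no side effects):
# retry 30 10 apply_vanilla
```

As in the original loop, the exit status of the pipeline is the status of kubectl apply.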
Once the command has completed, check the pods are ready
kubectl get pods -n cert-manager
kubectl get pods -n istio-system
kubectl get pods -n auth
kubectl get pods -n knative-eventing
kubectl get pods -n knative-serving
kubectl get pods -n kubeflow
kubectl get pods -n kubeflow-user-example-com
💡Note: Depending on the VM SKU that AKS Automatic picks for your region, the cluster might scale up to 4 nodes to run all the Kubeflow components
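Reading seven separate pod listings by eye is error-prone. The following sketch prints only the pods that are not fully ready, across the namespaces listed above. The function name pods_not_ready is hypothetical, and the sketch assumes the default column layout of kubectl get pods (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Sketch: print pods that are neither fully ready nor Completed in each namespace.
# pods_not_ready is a hypothetical helper name.
pods_not_ready() {
  local ns
  for ns in "$@"; do
    kubectl get pods -n "$ns" --no-headers | awk -v ns="$ns" '
      {
        split($2, ready, "/")   # READY column looks like "1/2"
        if ($3 != "Completed" && ready[1] != ready[2]) print ns "/" $1 " " $3
      }'
  done
}

# Usage (commented out so that sourcing this snippet has no side effects):
# pods_not_ready cert-manager istio-system auth knative-eventing \
#   knative-serving kubeflow kubeflow-user-example-com
```

An empty result suggests that every pod is either running with all containers ready or has completed.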
Access the Kubeflow dashboard
Run kubectl port-forward to access the Kubeflow dashboard
kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80
Finally, open http://localhost:8080 and log in with the default user’s credentials. The default email address is user@example.com and the default password is 12341234
Testing the deployment with a Notebook server
You can test that the deployments worked by creating a new Notebook server using the GUI.
- Click on “Create a new Notebook”
- Click on “+ New Notebook” in the top right corner of the resulting page
- Enter a name for the server
- Leave the “jupyterlab” option selected
- Feel free to pick one of the images available, in this case we choose the default
- Set Requested CPU to 0.5 and requested memory in Gi to 1
- Under Data Volumes click on “+ Add new volume”
- Expand the resulting section
- Set the name to datavol-1. The default name provided would not work because it has characters that are not allowed
- Set the size in Gi to 1
- Uncheck “Use default class”
- Choose a class from the provided options. In this case I will choose azurefile-premium
- Choose ReadWriteMany as the Access mode. Your data volume config should look like the picture below
- Click on “Launch” at the bottom of the page. A successful deployment should have a green checkmark under status, after 1-2 minutes.
- Click on “Connect” to access your jupyter lab
- Under Notebook, click on Python 3 to access your jupyter notebook and start coding
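The data volume settings chosen in the steps above roughly correspond to a PersistentVolumeClaim in the user namespace. The following sketch prints such a manifest so you can see what the GUI choices mean in Kubernetes terms. The function name datavol_pvc_manifest is hypothetical, and the exact object that the Notebook UI creates may carry additional labels or fields.

```shell
# Sketch: print a PersistentVolumeClaim that mirrors the GUI choices
# (name datavol-1, size 1Gi, class azurefile-premium, ReadWriteMany).
# datavol_pvc_manifest is a hypothetical helper name.
datavol_pvc_manifest() {
  local name="${1:-datavol-1}" size="${2:-1Gi}" class="${3:-azurefile-premium}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
  namespace: kubeflow-user-example-com
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: ${class}
  resources:
    requests:
      storage: ${size}
EOF
}

# Usage (commented out so that sourcing this snippet has no side effects):
# datavol_pvc_manifest
# kubectl get pvc datavol-1 -n kubeflow-user-example-com -o yaml
```

Comparing the printed manifest with the output of kubectl get pvc is a quick way to confirm that the volume was created with the intended class and access mode.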
Next steps
To connect to Kubeflow applications from outside the cluster, you need to set up HTTPS. The reason is that many of the Kubeflow web applications (e.g., Tensorboard Web Application, Jupyter Web Application, Katib UI) use Secure Cookies, so accessing Kubeflow with HTTP over a non-localhost domain does not work.
To set up HTTPS, see the Deploy with TLS deployment option.
3 - Authenticate Kubeflow users with Custom Password or Entra ID
Authenticating Kubeflow users on AKS with Custom Password or Entra ID
Background
In this lab, you will update the Kubeflow vanilla installation option to configure authentication using either custom users and passwords or Azure Entra ID.
Change default password
⚠️ Warning: Always update the default password before making the Kubeflow deployment accessible from outside the cluster.
To change the default password for the Kubeflow dashboard, you need to update the Dex configuration.
- First, generate a password hash by following the steps described in the Kubeflow docs, which use Python to generate a bcrypt hash. Alternatively, for simplicity, you can use an online tool like bcrypt-generator to create a new hash.
pip3 install passlib
python3 -c 'from passlib.hash import bcrypt; import getpass; print(bcrypt.using(rounds=12, ident="2y").hash(getpass.getpass()))'
Password: ***
$2y$12$XXXXXXXXXXXXXXXXXXX
- Delete existing password
kubectl delete secret dex-passwords -n auth
- Create new password secret
kubectl create secret generic dex-passwords --from-literal=DEX_USER_PASSWORD='REPLACE_WITH_HASH' -n auth
- Restart the Dex deployment to pick up the new password secret:
kubectl rollout restart deployment dex -n auth
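A mistyped or truncated hash only shows up later as a failed login. The following sketch checks that a string has the general shape of a bcrypt hash before it is stored in the secret. The function name is_bcrypt_hash is hypothetical. The check covers only the format: a $2a$, $2b$ or $2y$ prefix, a two-digit cost, and 53 salt-and-hash characters. It does not verify any password.

```shell
# Sketch: succeed only if the argument looks like a bcrypt hash.
# is_bcrypt_hash is a hypothetical helper name; this is a format check only.
is_bcrypt_hash() {
  printf '%s\n' "$1" | grep -Eq '^\$2[aby]\$[0-9]{2}\$[./A-Za-z0-9]{53}$'
}

# Usage (commented out so that sourcing this snippet has no side effects):
# is_bcrypt_hash 'REPLACE_WITH_HASH' || echo "This does not look like a bcrypt hash" >&2
```

Keep the hash in single quotes on the command line, because it contains $ characters that the shell would otherwise expand.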
To add more users
- Update the Dex config map deployments/vanilla/dex-config-map.yaml with more entries in the staticPasswords array:
staticPasswords:
- email: user@example.com
  hashFromEnv: DEX_USER_PASSWORD
  username: user
  userID: "15841185641784"
# Add more users here
- email: user2@example.com
  hashFromEnv: DEX_USER2_PASSWORD
  username: user2
  userID: "15841185641785"
- Set DEX_USER2_PASSWORD in the dex-passwords secret to the new password hash.
kubectl patch secret dex-passwords -n auth --type='json' -p='[{"op": "add", "path": "/data/DEX_USER2_PASSWORD", "value":"'$(echo -n 'REPLACE_WITH_HASH' | base64 | tr -d '\n')'"}]'
- Apply config map and restart deployment
kubectl apply -f deployments/vanilla/dex-config-map.yaml
kubectl rollout restart deployment dex -n auth
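Two details matter when patching the secret for a new user. A JSON Patch replace operation fails if the key does not exist yet, whereas add creates the key or overwrites it. Some base64 implementations also wrap long output, which would split the value. The following sketch builds the patch payload with both points handled. The function name dex_password_patch is hypothetical.

```shell
# Sketch: build a JSON Patch that sets one key in the dex-passwords secret.
# dex_password_patch is a hypothetical helper name.
dex_password_patch() {
  local key="$1" hash="$2" encoded
  # Strip newlines because some base64 implementations wrap long output.
  encoded=$(printf '%s' "$hash" | base64 | tr -d '\n')
  # "add" creates the key if it is missing and overwrites it otherwise.
  printf '[{"op": "add", "path": "/data/%s", "value": "%s"}]\n' "$key" "$encoded"
}

# Usage (commented out so that sourcing this snippet has no side effects):
# kubectl patch secret dex-passwords -n auth --type=json \
#   -p "$(dex_password_patch DEX_USER2_PASSWORD 'REPLACE_WITH_HASH')"
```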
Note: If you need to change the default email address, edit the params file located at manifests/common/user-namespace/base/params.env before installing Kubeflow.
Entra ID Configuration
4 - Deploy Kubeflow with Password, Ingress and TLS
Deploying Kubeflow on AKS with Custom Password and TLS
Background
In this lab, you will use the Azure CLI to deploy an Azure Kubernetes Service (AKS) Automatic cluster. AKS Automatic offers a simplified, managed Kubernetes experience with automated node management, scaling, and security configurations. For more details, see the AKS Automatic documentation. Note that AKS Automatic is currently in preview. While it provides faster setup and less manual configuration, it is not recommended for production use. For production workloads, or when advanced features and customization are required, use regular AKS instead.
You will then install Kubeflow with a custom password and TLS configuration. This deployment option uses a self-signed certificate and an ingress controller. Replace the self-signed certificate with your own CA certs for production workloads.
You can follow these same instructions to deploy Kubeflow on a non-automatic AKS cluster.
Deploy AKS Automatic
Use the Azure CLI to deploy an AKS Automatic cluster.
💡Note: To complete this deployment, you need the following permissions on the resource group:
- Microsoft.Authorization/policyAssignments/write
- Microsoft.Authorization/policyAssignments/read
For detailed instructions on installing AKS Automatic, please refer to the AKS Automatic installation documentation.
Log in to the Azure CLI with az login.
💡Note: If you have access to multiple subscriptions, you may need to run the following command to work with the appropriate subscription: az account set --subscription <NAME_OR_ID_OF_SUBSCRIPTION>
Set up your environment variables
RGNAME=kubeflow
CLUSTERNAME=kubeflow-aks-automatic
LOCATION=eastus
Create the resource group
az group create -n $RGNAME -l $LOCATION
Add or Update AKS extension
az extension add --name aks-preview
This article requires the aks-preview Azure CLI extension version 9.0.0b4 or later.
Create an AKS Automatic cluster
az aks create \
--resource-group $RGNAME \
--name $CLUSTERNAME \
--location $LOCATION \
--sku automatic \
--generate-ssh-keys
💡Note: AKS Automatic is in preview and requires the feature to be registered in your subscription. If it is not registered yet, run the following command before creating the cluster:
az feature register --namespace Microsoft.ContainerService --name AutomaticSKUPreview
Connect to AKS Automatic Cluster
After the cluster is created, you can connect to it using the Azure CLI. The following command retrieves the credentials for your AKS cluster and configures kubectl to use them.
az aks get-credentials --resource-group $RGNAME --name $CLUSTERNAME
Verify connectivity to the cluster with kubectl get nodes. This should return a list of nodes.
💡Note: With AKS Automatic, you don’t need kubelogin as the cluster uses managed identity for authentication.
Deploy Kubeflow with Password, Ingress and TLS
Clone this repo, which includes the kubeflow/manifests repo as a Git submodule
git clone --recurse-submodules https://github.com/Azure/kubeflow-aks.git
💡Note: The --recurse-submodules flag fetches the manifests from the Git submodule linked to this repo
Change directory into the newly cloned directory (kubeflow-aks)
From the root of the repo, ensure you’re using the v1.10-branch:
cd manifests/
git checkout v1.10-branch
cd ..
- Copy the TLS deployment files:
cp -a deployments/tls manifests/tls
⚠️ Warning: For production deployments, configure Microsoft Entra ID (formerly Azure AD) integration instead of static passwords.
In the next steps, generate a password hash for your custom password and insert it into the dex-passwords.yaml file.
First, generate the password hash by following the steps described in the Kubeflow docs, which use Python to generate a bcrypt hash (the command below requires the bcrypt Python package). Alternatively, for simplicity, you can use an online tool like bcrypt-generator to create a new hash.
PASSWORD="your_custom_password"
PASSWORD_HASH=$(python3 -c "import bcrypt; print(bcrypt.hashpw(b'$PASSWORD', bcrypt.gensalt()).decode())")
Update the password hash in the manifests/tls/dex-passwords.yaml secret:
sed -i "s|<YOUR_DEX_USER_PASSWORD>|$PASSWORD_HASH|g" manifests/tls/dex-passwords.yaml
⚠️ Warning: Always change the default password before exposing the Kubeflow dashboard externally.
Install Kubeflow
cd manifests/
while ! kustomize build tls | kubectl apply --server-side=true -f -; do echo "Retrying to apply resources"; sleep 10; done
💡Note: The --server-side=true flag helps with large CRDs that may exceed annotation size limits. The retry loop handles dependency ordering issues during installation.
- Once the command has completed, check the pods are ready
kubectl get pods -n cert-manager
kubectl get pods -n istio-system
kubectl get pods -n auth
kubectl get pods -n knative-eventing
kubectl get pods -n knative-serving
kubectl get pods -n kubeflow
kubectl get pods -n kubeflow-user-example-com
💡Note: Depending on the VM SKU that AKS Automatic picks for your region, the cluster might scale up to 4 nodes to run all the Kubeflow components
Expose the Kubeflow dashboard using Ingress with TLS
There are a couple of options for exposing your Kubeflow cluster with proper HTTPS using Ingress. See the note in the Kubeflow docs on NodePort / LoadBalancer / Ingress. This example uses the NGINX ingress controller, which is included as part of the app-routing-system addon in AKS Automatic.
Create a TLS certificate
You can create a self-signed certificate for Kubeflow using either the IP address of the NGINX ingress LoadBalancer or an assigned DNS label.
Step 1: Find the IP or DNS label of the NGINX ingress
NGINX_IP=$(kubectl get svc -n app-routing-system -o jsonpath='{.items[?(@.spec.type=="LoadBalancer")].status.loadBalancer.ingress[0].ip}')
echo "Nginx IP: $NGINX_IP"
- Optional: Use Azure DNS for a friendly URL
You can also configure a custom domain name assigned to the Nginx ingress service using Azure DNS:
kubectl annotate service nginx -n app-routing-system \
service.beta.kubernetes.io/azure-dns-label-name=my-kubeflow-cluster
This will make Kubeflow accessible at: my-kubeflow-cluster.$LOCATION.cloudapp.azure.com
💡Note: DNS Label must be unique for the Azure region.
Step 2: Create TLS Certificate
If you are using the IP address, create the following certificate:
echo "apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: kubeflow-tls-cert
  namespace: istio-system
spec:
  secretName: kubeflow-tls-secret
  ipAddresses:
  - $NGINX_IP
  isCA: false
  issuerRef:
    name: kubeflow-self-signing-issuer
    kind: ClusterIssuer
    group: cert-manager.io" | kubectl apply -f -
If you are using a DNS label, use the following definition instead (replace my-kubeflow-cluster with your unique DNS label):
echo "apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: kubeflow-tls-cert
  namespace: istio-system
spec:
  secretName: kubeflow-tls-secret
  dnsNames:
  - my-kubeflow-cluster.$LOCATION.cloudapp.azure.com
  isCA: false
  issuerRef:
    name: kubeflow-self-signing-issuer
    kind: ClusterIssuer
    group: cert-manager.io" | kubectl apply -f -
kubectl apply -f tls/certificate.yaml
Wait for the certificate to be ready:
kubectl wait --for=condition=Ready certificate/kubeflow-tls-cert -n istio-system --timeout=300s
Create and apply an ingress manifest to expose the Kubeflow components:
echo 'apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kubeflow-ingress
  namespace: istio-system
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/ssl-protocols: "TLSv1.2 TLSv1.3"
spec:
  ingressClassName: webapprouting.kubernetes.azure.com
  tls:
  - secretName: kubeflow-tls-secret
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80' | kubectl apply -f -
Verify Ingress:
kubectl get ingress kubeflow-ingress -n istio-system
Wait for the ADDRESS field to show an external IP address (this may take a few minutes).
NAME CLASS HOSTS ADDRESS PORTS AGE
kubeflow-ingress webapprouting.kubernetes.azure.com * xxx.149.0.222 443 16m
Access Kubeflow Dashboard
You can now access the Kubeflow dashboard at https://$NGINX_IP or https://my-kubeflow-cluster.$LOCATION.cloudapp.azure.com if DNS was configured.
⚠️ Warning: You will see a browser warning about the self-signed certificate. Click “Advanced” and proceed to access the dashboard.
Log in using:
- Email: user@example.com (or the email you configured)
- Password: The password you used to generate the hash
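Before opening the browser, you can check from the command line that the ingress answers over HTTPS. The following sketch prints the HTTP status code returned by the endpoint. The function name dashboard_status is hypothetical. The -k option skips certificate verification, which is only reasonable here because this lab uses a self-signed certificate.

```shell
# Sketch: print the HTTP status code returned by the given URL.
# dashboard_status is a hypothetical helper name.
# -k disables certificate verification (self-signed certificate in this lab).
dashboard_status() {
  curl -k -s -o /dev/null -w '%{http_code}' "$1"
}

# Usage (commented out so that sourcing this snippet has no side effects):
# dashboard_status "https://$NGINX_IP"
```

A 200 or a redirect status suggests that the ingress and the Istio gateway are reachable. A status of 000 means curl could not connect at all.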
Testing the deployment with a Notebook server
You can test that the deployments worked by creating a new Notebook server using the GUI.
- Click on “Create a new Notebook” on the Kubeflow dashboard

- Click on “+ New Notebook” in the top right corner of the resulting page
- Enter a name for the server
- Leave the “jupyterlab” option selected
- Feel free to pick one of the images available, in this case we choose the default
- Set Requested CPU to 0.5 and requested memory in Gi to 1
- Under Data Volumes click on “+ Add new volume”
- Expand the resulting section
- Set the name to datavol-1. The default name provided would not work because it has characters that are not allowed
- Set the size in Gi to 1
- Uncheck “Use default class”
- Choose a class from the provided options. In this case I will choose “azurefile-premium”
- Choose ReadWriteMany as the Access mode. Your data volume config should look like the picture below

- Click on “Launch” at the bottom of the page. A successful deployment should have a green checkmark under status, after 1-2 minutes.

- Click on “Connect” to access your jupyter lab
- Under Notebook, click on Python 3 to access your jupyter notebook and start coding
Destroy the resources
Run the command below to destroy the resources you just created after you are done testing
az group delete -n $RGNAME