Update an AKS web service with the provided properties. You can update the web service to use a new model, a new entry script, or new dependencies, which are specified in an inference configuration. Values left as NULL will remain unchanged in the web service.
update_aks_webservice( webservice, autoscale_enabled = NULL, autoscale_min_replicas = NULL, autoscale_max_replicas = NULL, autoscale_refresh_seconds = NULL, autoscale_target_utilization = NULL, auth_enabled = NULL, cpu_cores = NULL, memory_gb = NULL, enable_app_insights = NULL, scoring_timeout_ms = NULL, replica_max_concurrent_requests = NULL, max_request_wait_time = NULL, num_replicas = NULL, tags = NULL, properties = NULL, description = NULL, models = NULL, inference_config = NULL, gpu_cores = NULL, period_seconds = NULL, initial_delay_seconds = NULL, timeout_seconds = NULL, success_threshold = NULL, failure_threshold = NULL, namespace = NULL, token_auth_enabled = NULL )
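
As a rough illustration of the signature above, the following sketch updates only the autoscale settings of an existing service. It assumes the `azuremlsdk` package is installed and that a workspace `config.json` can be found from the working directory. The wrapper `update_autoscale()` and the service name in the example call are hypothetical and not part of the package.

```r
# Hypothetical helper, not part of azuremlsdk: widens the autoscale range of
# an existing AKS web service and leaves every other setting untouched.
update_autoscale <- function(service_name, min_replicas, max_replicas) {
  if (!requireNamespace("azuremlsdk", quietly = TRUE)) {
    stop("The azuremlsdk package is required for this sketch.")
  }

  # Assumes a workspace config.json is discoverable from the working directory.
  ws <- azuremlsdk::load_workspace_from_config()
  service <- azuremlsdk::get_webservice(ws, service_name)

  # Only the autoscale arguments are passed; all others keep their NULL
  # default and therefore remain unchanged on the service.
  azuremlsdk::update_aks_webservice(
    service,
    autoscale_enabled = TRUE,
    autoscale_min_replicas = min_replicas,
    autoscale_max_replicas = max_replicas
  )

  # Poll until the service reaches a terminal state.
  azuremlsdk::wait_for_deployment(service, show_output = TRUE)
  invisible(service)
}

# Example call ("my-aks-service" is a placeholder name):
# update_autoscale("my-aks-service", min_replicas = 1L, max_replicas = 4L)
```

Wrapping the calls in a function keeps the sketch from contacting Azure until it is explicitly invoked.
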
Argument | Description |
---|---|
webservice | The |
autoscale_enabled | If |
autoscale_min_replicas | An int of the minimum number of containers to use when autoscaling the web service. |
autoscale_max_replicas | An int of the maximum number of containers to use when autoscaling the web service. |
autoscale_refresh_seconds | An int of how often in seconds the autoscaler should attempt to scale the web service. |
autoscale_target_utilization | An int of the target utilization (in percent out of 100) the autoscaler should attempt to maintain for the web service. |
auth_enabled | If |
cpu_cores | The number of CPU cores to allocate for the web service. Can be a decimal. Defaults to |
memory_gb | The amount of memory (in GB) to allocate for the web service. Can be a decimal. Defaults to |
enable_app_insights | If |
scoring_timeout_ms | An int of the timeout (in milliseconds) to enforce for scoring calls to the web service. |
replica_max_concurrent_requests | An int of the maximum number of concurrent requests per node to allow for the web service. |
max_request_wait_time | An int of the maximum amount of time a request will stay in the queue (in milliseconds) before returning a 503 error. |
num_replicas | An int of the number of containers to allocate for the web service. If this parameter is not set then the autoscaler is enabled by default. |
tags | A named list of key-value tags for the web service, e.g. |
properties | A named list of key-value properties to add for the web service, e.g. |
description | A string of the description to give the web service. |
models | A list of |
inference_config | An |
gpu_cores | An int of the number of GPU cores to allocate for the web service. |
period_seconds | An int of how often in seconds to perform the liveness probe. Minimum value is |
initial_delay_seconds | An int of the number of seconds after the container has started before liveness probes are initiated. |
timeout_seconds | An int of the number of seconds after which the liveness probe times out. Minimum value is |
success_threshold | An int of the minimum consecutive successes for the liveness probe to be considered successful after having failed. Minimum value is |
failure_threshold | An int of the number of times Kubernetes will try the liveness probe when a Pod starts and the probe fails, before giving up. Minimum value is |
namespace | A string of the Kubernetes namespace in which to deploy the web service: up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters cannot be hyphens. |
token_auth_enabled | If |
None
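
The rule given for the namespace argument can be checked locally before calling the update. The sketch below is a minimal base R check under that stated rule. The helper `is_valid_namespace()` is hypothetical and not part of the package, and rejecting the empty string is an assumption, since the description only gives an upper length bound.

```r
# Hypothetical helper, not part of azuremlsdk: checks a namespace string
# against the rule in the argument table (at most 63 characters, lowercase
# alphanumerics and hyphens only, no hyphen in first or last position).
is_valid_namespace <- function(ns) {
  if (!is.character(ns) || length(ns) != 1 || is.na(ns)) {
    return(FALSE)
  }
  # Assumption: an empty string is treated as invalid.
  if (nchar(ns) < 1 || nchar(ns) > 63) {
    return(FALSE)
  }
  # First and last characters must be alphanumeric; hyphens are allowed
  # only in between. perl = TRUE keeps [a-z] restricted to ASCII, and \z
  # anchors at the true end of the string.
  grepl("^[a-z0-9]([a-z0-9-]*[a-z0-9])?\\z", ns, perl = TRUE)
}

is_valid_namespace("team-a")   # TRUE
is_valid_namespace("-team-a")  # FALSE: starts with a hyphen
is_valid_namespace("Team-A")   # FALSE: uppercase characters
```

A check like this only catches malformed names early; whether the namespace exists on the cluster is still decided by the service.
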