Installation

install_azureml()

Install the azureml SDK package
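
A typical first-time setup might look like the following (a minimal sketch; `install_azureml()` installs the Python SDK that the R package wraps):

```r
# Install the azuremlsdk R package from CRAN, then install the
# underlying Python SDK it wraps.
install.packages("azuremlsdk")
azuremlsdk::install_azureml()
```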

Workspaces

Functions for managing workspace resources. A Workspace is the top-level resource for Azure Machine Learning. It provides a centralized place to work with all the artifacts you create when you use Azure ML.
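
A typical workflow creates a workspace once, writes its details to a config file, and reloads it in later sessions. The sketch below uses placeholder subscription and resource group values:

```r
library(azuremlsdk)

# Create a new workspace (subscription and resource group are placeholders).
ws <- create_workspace(name = "myworkspace",
                       subscription_id = "<subscription-id>",
                       resource_group = "myresourcegroup",
                       location = "eastus2")
write_workspace_config(ws)

# In later sessions, reload the same workspace from the saved config file.
ws <- load_workspace_from_config()
```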

create_workspace()

Create a new Azure Machine Learning workspace

get_workspace()

Get an existing workspace

service_principal_authentication()

Manages authentication using a service principal instead of a user identity.

load_workspace_from_config()

Load workspace configuration details from a config file

write_workspace_config()

Write out the workspace configuration details to a config file

get_default_datastore()

Get the default datastore for a workspace

set_default_datastore()

Set the default datastore for a workspace

delete_workspace()

Delete a workspace

list_workspaces()

List all workspaces that the user has access to within a subscription

get_workspace_details()

Get the details of a workspace

get_default_keyvault()

Get the default keyvault for a workspace

set_secrets()

Add secrets to a keyvault

get_secrets()

Get secrets from a keyvault

delete_secrets()

Delete secrets from a keyvault

list_secrets()

List the secrets in a keyvault

interactive_login_authentication()

Manages authentication and acquires an authorization token in interactive login workflows.

Compute targets

Functions for managing compute resources. A Compute Target is a designated compute resource where you run your scripts or host your service deployments. Compute targets make it easy to change your compute environment without changing your code. Supported compute target types in the R SDK include AmlCompute and AksCompute.
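
For example, a small autoscaling AmlCompute cluster might be provisioned as follows (a sketch; the cluster name and VM size are placeholders):

```r
library(azuremlsdk)
ws <- load_workspace_from_config()

# Provision a cluster that scales between 0 and 2 nodes.
compute_target <- create_aml_compute(workspace = ws,
                                     cluster_name = "cpu-cluster",
                                     vm_size = "STANDARD_D2_V2",
                                     min_nodes = 0,
                                     max_nodes = 2)
wait_for_provisioning_completion(compute_target, show_output = TRUE)
```

Setting `min_nodes = 0` lets the cluster scale down to zero nodes when idle, so it only incurs cost while jobs are running.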

create_aml_compute()

Create an AmlCompute cluster

list_nodes_in_aml_compute()

Get the details (e.g. IP address, port) of all the compute nodes in the compute target

update_aml_compute()

Update scale settings for an AmlCompute cluster

create_aks_compute()

Create an AksCompute cluster

get_aks_compute_credentials()

Get the credentials for an AksCompute cluster

attach_aks_compute()

Attach an existing AKS cluster to a workspace

detach_aks_compute()

Detach an AksCompute cluster from its associated workspace

get_compute()

Get an existing compute cluster

wait_for_provisioning_completion()

Wait for a cluster to finish provisioning

list_supported_vm_sizes()

List the supported VM sizes in a region

delete_compute()

Delete a cluster

Working with data

Functions for accessing your data in Azure Storage services. A Datastore is attached to a workspace and is used to store connection information to an Azure storage service.
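
A common pattern is to upload local files to the workspace's default datastore (a sketch; the file paths are placeholders):

```r
library(azuremlsdk)
ws <- load_workspace_from_config()

# Upload a local file to the workspace's default datastore.
ds <- get_default_datastore(ws)
upload_files_to_datastore(ds,
                          files = list("./data/train.csv"),
                          target_path = "train-data")
```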

upload_files_to_datastore()

Upload files to the Azure storage a datastore points to

upload_to_datastore()

Upload a local directory to the Azure storage a datastore points to

download_from_datastore()

Download data from a datastore to the local file system

get_datastore()

Get an existing datastore

register_azure_blob_container_datastore()

Register an Azure blob container as a datastore

register_azure_file_share_datastore()

Register an Azure file share as a datastore

register_azure_sql_database_datastore()

Initialize a new Azure SQL database Datastore.

register_azure_postgre_sql_datastore()

Initialize a new Azure PostgreSQL Datastore.

register_azure_data_lake_gen2_datastore()

Initialize a new Azure Data Lake Gen2 Datastore.

unregister_datastore()

Unregister a datastore from its associated workspace

Working with datasets

Functions for managing datasets. An Azure Machine Learning Dataset allows you to interact with data in your datastores and package your data into a consumable object for machine learning tasks. Datasets can be created from local files, public urls, or specific file(s) in your datastores. Azure ML supports Dataset types of FileDataset and TabularDataset.
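
For instance, a TabularDataset can be created from a delimited file in a datastore, registered for reuse, and loaded into an R data frame (a sketch; names and paths are placeholders):

```r
library(azuremlsdk)
ws <- load_workspace_from_config()
ds <- get_default_datastore(ws)

# Create a TabularDataset from a CSV in the datastore and register it.
dataset <- create_tabular_dataset_from_delimited_files(
  path = data_path(ds, "train-data/train.csv"))
dataset <- register_dataset(ws, dataset, name = "training-data")

# Materialize the records as a data frame for local exploration.
df <- load_dataset_into_data_frame(dataset)
```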

register_dataset()

Register a Dataset in the workspace

unregister_all_dataset_versions()

Unregister all versions under the registration name of this dataset from the workspace.

get_dataset_by_name()

Get a registered Dataset from the workspace by its registration name.

get_dataset_by_id()

Get Dataset by ID.

get_input_dataset_from_run()

Return the named list for input datasets.

create_file_dataset_from_files()

Create a FileDataset to represent file streams.

get_file_dataset_paths()

Get a list of file paths for each file stream defined by the dataset.

download_from_file_dataset()

Download file streams defined by the dataset as local files.

mount_file_dataset()

Create a context manager for mounting file streams defined by the dataset as local files.

skip_from_dataset()

Skip file streams from the top of the dataset by the specified count.

take_from_dataset()

Take a sample of file streams from the top of the dataset by the specified count.

take_sample_from_dataset()

Take a random sample of file streams in the dataset, approximately matching the specified probability.

random_split_dataset()

Split file streams in the dataset into two parts randomly and approximately by the percentage specified.

create_tabular_dataset_from_parquet_files()

Create an unregistered, in-memory Dataset from parquet files.

create_tabular_dataset_from_delimited_files()

Create an unregistered, in-memory Dataset from delimited files.

create_tabular_dataset_from_json_lines_files()

Create a TabularDataset to represent tabular data in JSON Lines files (http://jsonlines.org/).

create_tabular_dataset_from_sql_query()

Create a TabularDataset to represent tabular data in SQL databases.

drop_columns_from_dataset()

Drop the specified columns from the dataset.

keep_columns_from_dataset()

Keep the specified columns and drop all others from the dataset.

filter_dataset_after_time()

Filter Tabular Dataset with time stamp columns after a specified start time.

filter_dataset_before_time()

Filter Tabular Dataset with time stamp columns before a specified end time.

filter_dataset_between_time()

Filter Tabular Dataset between a specified start and end time.

filter_dataset_from_recent_time()

Filter Tabular Dataset to contain only the specified duration (amount) of recent data.

define_timestamp_columns_for_dataset()

Define timestamp columns for the dataset.

load_dataset_into_data_frame()

Load all records from the dataset into a dataframe.

convert_to_dataset_with_csv_files()

Convert the current dataset into a FileDataset containing CSV files.

convert_to_dataset_with_parquet_files()

Convert the current dataset into a FileDataset containing Parquet files.

data_type_bool()

Configure conversion to bool.

data_type_datetime()

Configure conversion to datetime.

data_type_double()

Configure conversion to 53-bit double.

data_type_long()

Configure conversion to 64-bit integer.

data_type_string()

Configure conversion to string.

promote_headers_behavior()

Defines options for how column headers are processed when reading data from files to create a dataset.

data_path()

Represents a path to data in a datastore.

dataset_consumption_config()

Represents how to deliver the dataset to a compute target.

Environments

Functions for managing environments. An Azure Machine Learning Environment allows you to create, manage, and reuse the software dependencies required for training and deployment. Environments specify the R packages, environment variables, and software settings around your training and scoring scripts for your containerized training runs and deployments. They are managed and versioned entities within your Azure ML workspace that enable reproducible, auditable, and portable machine learning workflows across different compute targets. For more details, see r_environment().
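
For example, an environment with an extra CRAN dependency might be defined and registered like this (a sketch; the environment and package names are placeholders):

```r
library(azuremlsdk)
ws <- load_workspace_from_config()

# Define an environment that adds glmnet on top of the base R image.
env <- r_environment(name = "train-env",
                     cran_packages = list(cran_package("glmnet")))
register_environment(env, ws)
```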

r_environment()

Create an environment

cran_package()

Specifies a CRAN package to install in environment

github_package()

Specifies a Github package to install in environment

register_environment()

Register an environment in the workspace

get_environment()

Get an existing environment

container_registry()

Specify Azure Container Registry details

Training & experimentation

Functions for managing experiments and runs. An Experiment is a grouping of the collection of runs from a specified script. A Run represents a single trial of an experiment. A run is the object used to monitor the asynchronous execution of a trial, log metrics and store output of the trial, and to analyze results and access artifacts generated by the trial. The following run types are supported - ScriptRun (for Estimator experiments) and HyperDriveRun (for HyperDrive experiments). For functions that are specific only to HyperDriveRuns, see the Hyperparameter tuning reference sections. An Estimator wraps run configuration information for specifying details of executing an R script. Running an Estimator experiment (using submit_experiment()) will return a ScriptRun object and execute your training script on the specified compute target.
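
Putting those pieces together, a basic training submission might look like the following (a sketch; the script name and compute target are placeholders):

```r
library(azuremlsdk)
ws <- load_workspace_from_config()

exp <- experiment(ws, "my-experiment")
est <- estimator(source_directory = ".",
                 entry_script = "train.R",
                 compute_target = "cpu-cluster")
run <- submit_experiment(exp, est)
wait_for_run_completion(run, show_output = TRUE)

# Retrieve metrics the training script logged (e.g. via log_metric_to_run()).
metrics <- get_run_metrics(run)
```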

experiment()

Create an Azure Machine Learning experiment

get_runs_in_experiment()

Return a generator of the runs for an experiment

submit_experiment()

Submit an experiment and return the active created run

wait_for_run_completion()

Wait for the completion of a run

estimator()

Create an estimator

start_logging_run()

Create an interactive logging run

complete_run()

Mark a run as completed.

get_current_run()

Get the context object for a run

get_run()

Get an experiment run

get_run_details()

Get the details of a run

get_run_details_with_logs()

Get the details of a run along with the log files' contents

get_run_metrics()

Get the metrics logged to a run

get_secrets_from_run()

Get secrets from the keyvault associated with a run's workspace

cancel_run()

Cancel a run

get_run_file_names()

List the files that are stored in association with a run

download_file_from_run()

Download a file from a run

download_files_from_run()

Download files from a run

upload_files_to_run()

Upload files to a run

upload_folder_to_run()

Upload a folder to a run

log_metric_to_run()

Log a metric to a run

log_accuracy_table_to_run()

Log an accuracy table metric to a run

log_confusion_matrix_to_run()

Log a confusion matrix metric to a run

log_image_to_run()

Log an image metric to a run

log_list_to_run()

Log a vector metric value to a run

log_predictions_to_run()

Log a predictions metric to a run

log_residuals_to_run()

Log a residuals metric to a run

log_row_to_run()

Log a row metric to a run

log_table_to_run()

Log a table metric to a run

plot_run_details()

Generate table of run details

Hyperparameter tuning

Functions for configuring and managing hyperparameter tuning (HyperDrive) experiments. Azure ML’s HyperDrive functionality enables you to automate hyperparameter tuning of your machine learning models. For example, you can define the parameter search space as discrete or continuous, and a sampling method over the search space as random, grid, or Bayesian. Also, you can specify a primary metric to optimize in the hyperparameter tuning experiment, and whether to minimize or maximize that metric. You can also define early termination policies in which poorly performing experiment runs are canceled and new ones started.
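
A HyperDrive run combines a sampling strategy, a primary metric, and optionally an early termination policy. The sketch below assumes the training script accepts a learning_rate script parameter and logs an "accuracy" metric:

```r
library(azuremlsdk)
ws <- load_workspace_from_config()
exp <- experiment(ws, "tune-experiment")

est <- estimator(source_directory = ".",
                 entry_script = "train.R",
                 compute_target = "cpu-cluster")

# Randomly sample learning rates; stop clearly underperforming runs early.
sampling <- random_parameter_sampling(list(learning_rate = uniform(0.001, 0.1)))
config <- hyperdrive_config(hyperparameter_sampling = sampling,
                            primary_metric_name = "accuracy",
                            primary_metric_goal = primary_metric_goal("MAXIMIZE"),
                            max_total_runs = 8,
                            policy = bandit_policy(slack_factor = 0.1),
                            estimator = est)

run <- submit_experiment(exp, config)
best_run <- get_best_run_by_primary_metric(run)
```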

hyperdrive_config()

Create a configuration for a HyperDrive run

random_parameter_sampling()

Define random sampling over a hyperparameter search space

grid_parameter_sampling()

Define grid sampling over a hyperparameter search space

bayesian_parameter_sampling()

Define Bayesian sampling over a hyperparameter search space

choice()

Specify a discrete set of options to sample from

randint()

Specify a set of random integers in the range [0, upper)

uniform()

Specify a uniform distribution of options to sample from

quniform()

Specify a uniform distribution of the form round(uniform(min_value, max_value) / q) * q

loguniform()

Specify a log uniform distribution

qloguniform()

Specify a log uniform distribution of the form round(exp(uniform(min_value, max_value)) / q) * q

normal()

Specify a real value that is normally-distributed with mean mu and standard deviation sigma

qnormal()

Specify a normal distribution of the form round(normal(mu, sigma) / q) * q

lognormal()

Specify a normal distribution of the form exp(normal(mu, sigma))

qlognormal()

Specify a normal distribution of the form round(exp(normal(mu, sigma)) / q) * q

primary_metric_goal()

Define supported metric goals for hyperparameter tuning

bandit_policy()

Define a Bandit policy for early termination of HyperDrive runs

median_stopping_policy()

Define a median stopping policy for early termination of HyperDrive runs

truncation_selection_policy()

Define a truncation selection policy for early termination of HyperDrive runs

get_best_run_by_primary_metric()

Return the best performing run amongst all completed runs

get_child_runs_sorted_by_primary_metric()

Get the child runs sorted in descending order by best primary metric

get_child_run_hyperparameters()

Get the hyperparameters for all child runs

get_child_run_metrics()

Get the metrics from all child runs

Model management & deployment

Functions for model management and deployment. Registering a model allows you to store and version your trained model in a workspace. A registered Model can then be deployed as a Webservice using Azure ML. If you would like to access all the assets needed to host a model as a web service without actually deploying the model, you can do so by packaging the model as a ModelPackage. You can deploy your model as a LocalWebservice (locally), AciWebservice (on Azure Container Instances), or AksWebservice (on Azure Kubernetes Service).
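
End to end, registering a model and deploying it to Azure Container Instances might look like the following (a sketch; the model file and scoring script are placeholders):

```r
library(azuremlsdk)
ws <- load_workspace_from_config()

# Register a saved model artifact, then deploy it behind an ACI web service.
model <- register_model(ws,
                        model_path = "outputs/model.rds",
                        model_name = "my-model")
inf_config <- inference_config(entry_script = "score.R",
                               source_directory = ".")
deploy_config <- aci_webservice_deployment_config(cpu_cores = 1, memory_gb = 1)
service <- deploy_model(ws, "my-service", list(model),
                        inf_config, deploy_config)
wait_for_deployment(service, show_output = TRUE)
```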

get_model()

Get a registered model

register_model()

Register a model to a given workspace

register_model_from_run()

Register a model for operationalization.

download_model()

Download a model to the local file system

deploy_model()

Deploy a web service from registered model(s)

package_model()

Create a model package that packages all the assets needed to host a model as a web service

delete_model()

Delete a model from its associated workspace

get_model_package_container_registry()

Get the Azure container registry that a packaged model uses

get_model_package_creation_logs()

Get the model package creation logs

pull_model_package_image()

Pull the Docker image from a ModelPackage to your local Docker environment

save_model_package_files()

Save a Dockerfile and dependencies from a ModelPackage to your local file system

wait_for_model_package_creation()

Wait for a model package to finish creating

inference_config()

Create an inference configuration for model deployments

get_webservice()

Get a deployed web service

wait_for_deployment()

Wait for a web service to finish deploying

get_webservice_logs()

Retrieve the logs for a web service

get_webservice_keys()

Retrieve auth keys for a web service

generate_new_webservice_key()

Regenerate one of a web service's keys

get_webservice_token()

Retrieve the auth token for a web service

invoke_webservice()

Call a web service with the provided input

delete_webservice()

Delete a web service from a given workspace

aci_webservice_deployment_config()

Create a deployment config for deploying an ACI web service

update_aci_webservice()

Update a deployed ACI web service

aks_webservice_deployment_config()

Create a deployment config for deploying an AKS web service

update_aks_webservice()

Update a deployed AKS web service

local_webservice_deployment_config()

Create a deployment config for deploying a local web service

update_local_webservice()

Update a local web service

delete_local_webservice()

Delete a local web service from the local machine

reload_local_webservice_assets()

Reload a local web service's entry script and dependencies

resource_configuration()

Initialize the ResourceConfiguration.