Running Code in the Cloud

Experiments and Runs

Azure ML is a machine-learning service that facilitates running your code in the cloud. A Run is an abstraction layer around each such submission, and is used to monitor the job in real time as well as keep a history of your results.

  • Run: A run represents a single execution of your code. See Run for more details.
  • Experiment: An experiment is a lightweight container for runs. Use experiments to submit and track runs.

Create an experiment in your workspace ws.

from azureml.core import Experiment
exp = Experiment(ws, '<experiment-name>')
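
Experiments also track the runs submitted to them. To browse an experiment's run history, e.g.:

for run in exp.get_runs():   # iterate over this experiment's submitted runs
    print(run.id, run.get_status())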

ScriptRunConfig

A common way to run code in the cloud is via a ScriptRunConfig, which packages your source code (Script) together with your run configuration (RunConfig).

Consider the following layout for your code.

source_directory/
    script.py     # entry point to your code
    module1.py    # modules called by script.py
    ...

To run script.py in the cloud, use a ScriptRunConfig:

from azureml.core import ScriptRunConfig

config = ScriptRunConfig(
    source_directory='<path/to/source_directory>',
    script='script.py',
    compute_target=target,
    environment=env,
    arguments=[
        '--learning_rate', 0.001,
        '--momentum', 0.9,
    ],
)

where:

  • source_directory='<path/to/source_directory>' : Local directory containing your code.
  • script='script.py' : Script to run. This does not need to be at the root of source_directory.
  • compute_target=target : See Compute Target (a minimal sketch of obtaining a target and environment follows this list).
  • environment=env : See Environment
  • arguments : See Arguments
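
For reference, a sketch of obtaining a target and environment, assuming a compute target named '<compute-name>' already exists in the workspace and a conda file env.yml lists your dependencies:

from azureml.core import Environment

target = ws.compute_targets['<compute-name>']   # existing compute target in the workspace
env = Environment.from_conda_specification(
    name='<env-name>',
    file_path='env.yml',                        # conda dependencies file
)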

Submit this code to Azure with

exp = Experiment(ws, '<exp-name>')
run = exp.submit(config)
print(run)
run.wait_for_completion(show_output=True)

This will present you with a link to monitor your run on the web (https://ml.azure.com) as well as stream logs to your terminal.
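
The portal link and run status can also be retrieved programmatically:

print(run.get_portal_url())   # link to the run in the studio
print(run.get_status())       # e.g. 'Queued', 'Running', 'Completed'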

Command Line Arguments

To pass command line arguments to your script use the arguments parameter in ScriptRunConfig. Arguments are specified as a list:

arguments = [first, second, third, ...]

which are then passed to the script as command-line arguments as follows:

$ python script.py first second third ...

This also supports using named arguments:

arguments = ['--first_arg', first_val, '--second_arg', second_val, ...]

Arguments can be of type int, float or str, and can also be used to reference data.

For more details on referencing data via the command line: Use dataset in a remote run

Example: sys.argv

In this example we pass two arguments to our script. If we were running this from the console:

console
$ python script.py 0.001 0.9

To mimic this command using the arguments parameter in ScriptRunConfig:

run.py
arguments = [0.001, 0.9]

config = ScriptRunConfig(
    source_directory='.',
    script='script.py',
    arguments=arguments,
)

which can be consumed as usual in our script:

script.py
import sys

learning_rate = sys.argv[1]  # '0.001' (command-line values arrive as strings)
momentum = sys.argv[2]       # '0.9'

Example: argparse

In this example we pass two named arguments to our script. If we were running this from the console:

console
$ python script.py --learning_rate 0.001 --momentum 0.9

To mimic this behavior in ScriptRunConfig:

run.py
arguments = [
    '--learning_rate', 0.001,
    '--momentum', 0.9,
]

config = ScriptRunConfig(
    source_directory='.',
    script='script.py',
    arguments=arguments,
)

which can be consumed as usual in our script:

script.py
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--learning_rate', type=float)
parser.add_argument('--momentum', type=float)
args = parser.parse_args()

learning_rate = args.learning_rate  # gets 0.001
momentum = args.momentum            # gets 0.9

Commands

It is possible to provide the explicit command to run.

command = 'python script.py'.split()

config = ScriptRunConfig(
    source_directory='<path/to/code>',
    command=command,
    compute_target=compute_target,
    environment=environment,
)

This example is equivalent to setting the argument script='script.py' in place of the command argument.

This option provides a lot of flexibility. For example:

  • Set environment variables: Some useful examples:

    command = 'export PYTHONPATH=$PWD && python script.py'.split()
    command = f'export RANK={rank} && python script.py'.split()
  • Run a setup script: e.g. to download data or set environment variables before the entry point runs.

    command = 'python setup.py && python script.py'.split()

Using Datasets

via Arguments

Pass a dataset to your ScriptRunConfig as an argument

from azureml.core import Dataset, ScriptRunConfig

# create dataset
datastore = ws.get_default_datastore()
dataset = Dataset.File.from_files(path=(datastore, '<path/on/datastore>'))

arguments = ['--dataset', dataset.as_mount()]

config = ScriptRunConfig(
    source_directory='.',
    script='script.py',
    arguments=arguments,
)

This mounts the dataset to the run, where it can be referenced by script.py.
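
On the script side, the mount point arrives as an ordinary command-line argument. A minimal sketch of a script.py consuming it (the argument name --dataset matches the config above):

script.py
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument('--dataset', type=str)   # receives the dataset's mount path
args = parser.parse_args()

print(os.listdir(args.dataset))   # files from the datastore path are available here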

Run

Interactive

In an interactive setting, e.g. a Jupyter notebook:

run = exp.start_logging()

Example: Jupyter notebook

A common use case for interactive logging is to train a model in a notebook.

from azureml.core import Workspace
from azureml.core import Experiment

ws = Workspace.from_config()
exp = Experiment(ws, 'example')

run = exp.start_logging()          # start interactive run
print(run.get_portal_url())        # get link to studio

# toy example in place of e.g. model
# training or exploratory data analysis
import numpy as np

for x in np.linspace(0, 10):
    y = np.sin(x)
    run.log_row('sine', x=x, y=y)  # log metrics

run.complete()                     # stop interactive run

Follow the link to the run to see the metric logging in real time.
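
Logged metrics can also be read back from the run object:

metrics = run.get_metrics()   # dict mapping metric names to their logged values
print(metrics)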

Get Context

Code running within Azure ML is associated with a Run. The submitted code can access its own run.

from azureml.core import Run
run = Run.get_context()

Example: Logging metrics to current run context

A common use case is logging metrics in a training script.

train.py
from azureml.core import Run

run = Run.get_context()

# training code
for epoch in range(n_epochs):
    model.train()
    ...
    val = model.evaluate()
    run.log('validation', val)

When this code is submitted to Azure ML (e.g. via ScriptRunConfig), it will log metrics to its associated run.
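
As a quick reference, a few commonly used logging methods (a minimal sketch; values are illustrative):

from azureml.core import Run

run = Run.get_context()
run.log('accuracy', 0.91)                 # single scalar value
run.log_list('losses', [0.9, 0.5, 0.2])   # list of values
run.log_row('sine', x=0.5, y=0.48)        # row of named values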

For more details: Logging Metrics