Interface BatchAIJob
All Superinterfaces:
HasId, HasInner<com.microsoft.azure.management.batchai.implementation.JobInner>, HasName, Indexable, Refreshable<BatchAIJob>
@Beta(V1_6_0)
public interface BatchAIJob
extends HasInner<com.microsoft.azure.management.batchai.implementation.JobInner>, Indexable, HasId, HasName, Refreshable<BatchAIJob>
Client-side representation of a Batch AI Job object, associated with a Batch AI Cluster.

Nested Class Summary
Modifier and Type | Description
static interface | The entirety of the Batch AI job definition.
static interface | Grouping of Batch AI job definition stages.

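Method Summary
Modifier and Type | Method | Description
CaffeSettings | caffeSettings()
ChainerSettings | chainerSettings()
ResourceId | cluster()
CNTKsettings | cntkSettings()
JobPropertiesConstraints | constraints()
ContainerSettings | containerSettings()
org.joda.time.DateTime | creationTime()
CustomToolkitSettings | customToolkitSettings()
List<EnvironmentVariable> | environmentVariables()
JobPropertiesExecutionInfo | executionInfo()
ExecutionState | executionState() | Gets the current state of the job.
org.joda.time.DateTime | executionStateTransitionTime()
BatchAIExperiment | experiment()
List<InputDirectory> | inputDirectories()
String | jobOutputDirectoryPathSegment()
JobPreparation | jobPreparation()
com.microsoft.azure.PagedList<OutputFile> | listFiles(String outputDirectoryId) | List all files inside the given output directory (Only if the output directory is on Azure File Share or Azure Storage container).
com.microsoft.azure.PagedList<OutputFile> | listFiles(String outputDirectoryId, String directory, Integer linkExpiryMinutes, Integer maxResults) | List all files inside the given output directory (Only if the output directory is on Azure File Share or Azure Storage container).
rx.Observable<OutputFile> | listFilesAsync(String outputDirectoryId) | List all files inside the given output directory (Only if the output directory is on Azure File Share or Azure Storage container).
rx.Observable<OutputFile> | listFilesAsync(String outputDirectoryId, String directory, Integer linkExpiryMinutes, Integer maxResults) | List all files inside the given output directory (Only if the output directory is on Azure File Share or Azure Storage container).
com.microsoft.azure.PagedList<RemoteLoginInformation> | listRemoteLoginInformation() | Gets a list of currently existing nodes which were used for the Job execution.
rx.Observable<RemoteLoginInformation> | listRemoteLoginInformationAsync() | Gets a list of currently existing nodes which were used for the Job execution.
MountVolumes | mountVolumes()
int | nodeCount()
List<OutputDirectory> | outputDirectories()
ProvisioningState | provisioningState()
org.joda.time.DateTime | provisioningStateTransitionTime() | The time at which the job entered its current provisioning state.
PyTorchSettings | pyTorchSettings()
JobPriority | schedulingPriority()
List<EnvironmentVariableWithSecretValue> | secrets()
String | stdOutErrPathPrefix()
TensorFlowSettings | tensorFlowSettings()
void | terminate() | Terminates a job.
rx.Completable | terminateAsync() | Terminates a job.
ToolType | toolType()
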
Methods inherited from interface com.microsoft.azure.management.resources.fluentcore.arm.models.HasId
id
Methods inherited from interface com.microsoft.azure.management.resources.fluentcore.model.HasInner
inner
Methods inherited from interface com.microsoft.azure.management.resources.fluentcore.arm.models.HasName
name
Methods inherited from interface com.microsoft.azure.management.resources.fluentcore.model.Indexable
key
Methods inherited from interface com.microsoft.azure.management.resources.fluentcore.model.Refreshable
refresh, refreshAsync

Method Details

terminate
void terminate()
Terminates a job.

terminateAsync
rx.Completable terminateAsync()
Terminates a job.
Returns:
a representation of the deferred computation of this call

listFiles
com.microsoft.azure.PagedList<OutputFile> listFiles(String outputDirectoryId)
List all files inside the given output directory (Only if the output directory is on Azure File Share or Azure Storage container).
Parameters:
outputDirectoryId - Id of the job output directory. This is the OutputDirectory-->id parameter that is given by the user during Create Job.
Returns:
list of files inside the given output directory

listFilesAsync
rx.Observable<OutputFile> listFilesAsync(String outputDirectoryId)
List all files inside the given output directory (Only if the output directory is on Azure File Share or Azure Storage container).
Parameters:
outputDirectoryId - Id of the job output directory. This is the OutputDirectory-->id parameter that is given by the user during Create Job.
Returns:
an observable that emits output file information

listFiles
com.microsoft.azure.PagedList<OutputFile> listFiles(String outputDirectoryId, String directory, Integer linkExpiryMinutes, Integer maxResults)
List all files inside the given output directory (Only if the output directory is on Azure File Share or Azure Storage container).
Parameters:
outputDirectoryId - Id of the job output directory. This is the OutputDirectory-->id parameter that is given by the user during Create Job.
directory - the path to the directory
linkExpiryMinutes - the number of minutes after which the download link will expire
maxResults - the maximum number of items to return in the response. A maximum of 1000 files can be returned
Returns:
list of files inside the given output directory

listFilesAsync
rx.Observable<OutputFile> listFilesAsync(String outputDirectoryId, String directory, Integer linkExpiryMinutes, Integer maxResults)
List all files inside the given output directory (Only if the output directory is on Azure File Share or Azure Storage container).
Parameters:
outputDirectoryId - Id of the job output directory. This is the OutputDirectory-->id parameter that is given by the user during Create Job.
directory - the path to the directory
linkExpiryMinutes - the number of minutes after which the download link will expire
maxResults - the maximum number of items to return in the response. A maximum of 1000 files can be returned
Returns:
an observable that emits output file information
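
The limits described for the four-argument listFiles overloads can be checked on the client before a call is made. The following is a minimal sketch and not part of the SDK: ListFilesArgs is a hypothetical helper, and both the lower bound of 1 and the treatment of null as "use the service default" are assumptions. Only the upper bound of 1000 comes from the maxResults description above.

```java
// Hypothetical helper (not an SDK class) that validates listFiles arguments.
final class ListFilesArgs {
    // Upper bound taken from the maxResults parameter description.
    static final int MAX_RESULTS_LIMIT = 1000;

    final String outputDirectoryId;
    final String directory;
    final Integer linkExpiryMinutes;
    final Integer maxResults;

    ListFilesArgs(String outputDirectoryId, String directory,
                  Integer linkExpiryMinutes, Integer maxResults) {
        if (outputDirectoryId == null || outputDirectoryId.isEmpty()) {
            throw new IllegalArgumentException("outputDirectoryId is required");
        }
        // Assumption: null means "let the service pick a default",
        // because the documented parameters are boxed Integers.
        if (maxResults != null && (maxResults < 1 || maxResults > MAX_RESULTS_LIMIT)) {
            throw new IllegalArgumentException(
                    "maxResults must be between 1 and " + MAX_RESULTS_LIMIT);
        }
        // Assumption: a link expiry below one minute is not meaningful.
        if (linkExpiryMinutes != null && linkExpiryMinutes < 1) {
            throw new IllegalArgumentException("linkExpiryMinutes must be at least 1");
        }
        this.outputDirectoryId = outputDirectoryId;
        this.directory = directory;
        this.linkExpiryMinutes = linkExpiryMinutes;
        this.maxResults = maxResults;
    }
}
```

With a BatchAIJob instance in hand, the validated fields could then be passed straight through, for example job.listFiles(args.outputDirectoryId, args.directory, args.linkExpiryMinutes, args.maxResults).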

listRemoteLoginInformation
com.microsoft.azure.PagedList<RemoteLoginInformation> listRemoteLoginInformation()
Gets a list of currently existing nodes which were used for the Job execution. The returned information contains the node ID, its public IP and SSH port.
Returns:
list of remote login details

listRemoteLoginInformationAsync
rx.Observable<RemoteLoginInformation> listRemoteLoginInformationAsync()
Gets a list of currently existing nodes which were used for the Job execution. The returned information contains the node ID, its public IP and SSH port.
Returns:
an observable that emits remote login information
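
Since the returned information carries a node ID, a public IP and an SSH port, it is enough to build an SSH command for a node. The sketch below uses NodeLogin, a hypothetical stand-in class, because the accessor names of RemoteLoginInformation are not shown on this page. The user name is also an assumption: it is not part of the returned data and has to come from elsewhere.

```java
// Hypothetical stand-in holding the three values the description mentions.
// It is not the SDK's RemoteLoginInformation type.
final class NodeLogin {
    final String nodeId;
    final String ipAddress;
    final int port;

    NodeLogin(String nodeId, String ipAddress, int port) {
        this.nodeId = nodeId;
        this.ipAddress = ipAddress;
        this.port = port;
    }

    // Builds an OpenSSH command line for this node.
    // userName is supplied by the caller; it is not part of the login details.
    String sshCommand(String userName) {
        return "ssh -p " + port + " " + userName + "@" + ipAddress;
    }

    public static void main(String[] args) {
        // Example values only; 203.0.113.10 is a documentation-range address.
        NodeLogin node = new NodeLogin("node-1", "203.0.113.10", 50000);
        System.out.println(node.nodeId + ": " + node.sshCommand("demo"));
    }
}
```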

schedulingPriority
JobPriority schedulingPriority()
Returns:
priority associated with the job. Priority values can range from -1000 to 1000, with -1000 being the lowest priority and 1000 being the highest priority. The default value is 0.

cluster
ResourceId cluster()
Returns:
the Id of the cluster on which this job will run.

mountVolumes
MountVolumes mountVolumes()
Returns:
information on mount volumes to be used by the job. These volumes will be mounted before the job execution and will be unmounted after the job completion. The volumes will be mounted at the location specified by the $AZ_BATCHAI_JOB_MOUNT_ROOT environment variable.
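
Code running inside the job can locate mounted data by resolving paths against $AZ_BATCHAI_JOB_MOUNT_ROOT. The following is a minimal sketch under stated assumptions: MountRoot is a hypothetical helper, the relative path "afs/data" and the root "/mnt/example" are made-up values, and the environment is passed in as a map so the logic can be exercised outside a job.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.Map;

// Hypothetical helper (not an SDK class) for locating files under the mount root.
final class MountRoot {
    // Variable name taken from the mountVolumes description.
    static final String MOUNT_ROOT_VAR = "AZ_BATCHAI_JOB_MOUNT_ROOT";

    // Resolves a path relative to the mount root found in the given environment.
    static Path resolve(Map<String, String> env, String relativePath) {
        String root = env.get(MOUNT_ROOT_VAR);
        if (root == null || root.isEmpty()) {
            throw new IllegalStateException(MOUNT_ROOT_VAR + " is not set");
        }
        if (Paths.get(relativePath).isAbsolute()) {
            throw new IllegalArgumentException("expected a path relative to the mount root");
        }
        return Paths.get(root).resolve(relativePath).normalize();
    }

    public static void main(String[] args) {
        // Inside a running job, System.getenv() would be passed instead.
        Map<String, String> env = Collections.singletonMap(MOUNT_ROOT_VAR, "/mnt/example");
        System.out.println(resolve(env, "afs/data"));
    }
}
```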

jobOutputDirectoryPathSegment
String jobOutputDirectoryPathSegment()
Returns:
a segment of the job's output directory path created by Batch AI. Batch AI creates the job's output directories under a unique path to avoid conflicts between jobs. This value contains a path segment generated by Batch AI to make the path unique and can be used to find the output directory on the node or mounted filesystem.

nodeCount
int nodeCount()
Returns:
number of compute nodes to run the job on. The job will be gang scheduled on that many compute nodes.

containerSettings
ContainerSettings containerSettings()
Returns:
the settings for the container to run the job. If not provided, the job will run on the VM.

toolType
ToolType toolType()
Returns:
the toolkit type of this job

cntkSettings
CNTKsettings cntkSettings()
Returns:
the settings for a CNTK (Microsoft Cognitive Toolkit) job

pyTorchSettings
PyTorchSettings pyTorchSettings()
Returns:
the settings for a PyTorch job

tensorFlowSettings
TensorFlowSettings tensorFlowSettings()
Returns:
the settings for a TensorFlow job

caffeSettings
CaffeSettings caffeSettings()
Returns:
the settings for a Caffe job.

chainerSettings
ChainerSettings chainerSettings()
Returns:
the settings for a Chainer job.

customToolkitSettings
CustomToolkitSettings customToolkitSettings()
Returns:
the settings for a custom toolkit job

jobPreparation
JobPreparation jobPreparation()
Returns:
the actions to be performed before the toolkit is launched. The specified actions will run on all the nodes that are part of the job.

stdOutErrPathPrefix
String stdOutErrPathPrefix()
Returns:
the path where the Batch AI service will upload the stdout and stderr of the job.

inputDirectories
List<InputDirectory> inputDirectories()
Returns:
the list of input directories for the Job

outputDirectories
List<OutputDirectory> outputDirectories()
Returns:
the list of output directories where the models will be created

environmentVariables
List<EnvironmentVariable> environmentVariables()
Returns:
additional environment variables to be passed to the job. The Batch AI service sets the following environment variables for all jobs: AZ_BATCHAI_INPUT_id, AZ_BATCHAI_OUTPUT_id, AZ_BATCHAI_NUM_GPUS_PER_NODE. For distributed TensorFlow jobs, the following additional environment variables are set by the Batch AI service: AZ_BATCHAI_PS_HOSTS, AZ_BATCHAI_WORKER_HOSTS.
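
The service-provided variables listed above can be read from inside the job process. The sketch below is illustrative only: BatchAIEnv is a hypothetical helper, it assumes the trailing "id" in AZ_BATCHAI_INPUT_id and AZ_BATCHAI_OUTPUT_id stands for the directory id chosen when the job was created, and it assumes the host lists are comma-separated. Neither format is stated on this page, so both should be checked against the service documentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not an SDK class) for reading Batch AI job variables.
final class BatchAIEnv {
    // Assumption: the "id" suffix is replaced by the input directory id.
    static String inputVar(String directoryId) {
        return "AZ_BATCHAI_INPUT_" + directoryId;
    }

    // Assumption: the "id" suffix is replaced by the output directory id.
    static String outputVar(String directoryId) {
        return "AZ_BATCHAI_OUTPUT_" + directoryId;
    }

    // Returns the GPU count per node, or 0 when the variable is absent.
    static int gpusPerNode(Map<String, String> env) {
        String raw = env.get("AZ_BATCHAI_NUM_GPUS_PER_NODE");
        return raw == null ? 0 : Integer.parseInt(raw.trim());
    }

    // Splits a host list such as AZ_BATCHAI_WORKER_HOSTS.
    // Assumption: entries are separated by commas.
    static List<String> hosts(Map<String, String> env, String variableName) {
        List<String> result = new ArrayList<>();
        String raw = env.get(variableName);
        if (raw == null) {
            return result;
        }
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }
}
```

Inside a job, System.getenv() would be passed as the map, for example BatchAIEnv.hosts(System.getenv(), "AZ_BATCHAI_WORKER_HOSTS").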

secrets
List<EnvironmentVariableWithSecretValue> secrets()
Returns:
environment variables with secret values to set on the job. Only names are reported; the server will never report the values of these variables back.

constraints
JobPropertiesConstraints constraints()
Returns:
constraints associated with the Job.

creationTime
org.joda.time.DateTime creationTime()
Returns:
the creation time of the job

provisioningState
ProvisioningState provisioningState()
Returns:
the provisioned state of the Batch AI job

provisioningStateTransitionTime
org.joda.time.DateTime provisioningStateTransitionTime()
The time at which the job entered its current provisioning state.
Returns:
the time at which the job entered its current provisioning state

executionState
ExecutionState executionState()
Gets the current state of the job. Possible values are:
queued - The job is queued and able to run. A job enters this state when it is created, or when it is awaiting a retry after a failed run.
running - The job is running on a compute cluster. This includes job-level preparation such as downloading resource files or setting up the container specified on the job. It does not necessarily mean that the job command line has started executing.
terminating - The job has been terminated by the user and the terminate operation is in progress.
succeeded - The job has completed running successfully and exited with exit code 0.
failed - The job has finished unsuccessfully (failed with a non-zero exit code) and has exhausted its retry limit. A job is also marked as failed if an error occurred launching the job.
Returns:
the current state of the job
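
Of the five states listed for executionState, only succeeded and failed describe a job that has finished, which is the natural stopping condition for code that waits on a job. The sketch below shows that check with a hypothetical stand-in enum. JobStates.State mirrors the documented state names only and is not the SDK's ExecutionState type, whose constants are not shown on this page.

```java
import java.util.Locale;

// Hypothetical helper (not an SDK class) built from the documented state names.
final class JobStates {
    // Stand-in for the five states listed in the executionState description.
    enum State { QUEUED, RUNNING, TERMINATING, SUCCEEDED, FAILED }

    // Parses a state name such as "running", ignoring case and surrounding spaces.
    static State parse(String value) {
        return State.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    // A job is finished once it has succeeded or failed.
    // queued, running and terminating are all still in progress.
    static boolean isTerminal(State state) {
        return state == State.SUCCEEDED || state == State.FAILED;
    }

    public static void main(String[] args) {
        for (State state : State.values()) {
            System.out.println(state + " terminal=" + isTerminal(state));
        }
    }
}
```

A polling loop would typically call refresh() on the job (inherited from Refreshable), read the state again, and stop once a check like isTerminal returns true.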

executionStateTransitionTime
org.joda.time.DateTime executionStateTransitionTime()
Returns:
the time at which the job entered its current execution state

executionInfo
JobPropertiesExecutionInfo executionInfo()
Returns:
information about the execution of a job in the Azure Batch service.

experiment
BatchAIExperiment experiment()
Returns:
the experiment information of the job.