Interface BatchAIJob

All Superinterfaces:
HasId, HasInner<com.microsoft.azure.management.batchai.implementation.JobInner>, HasName, Indexable, Refreshable<BatchAIJob>

@Beta(V1_6_0) public interface BatchAIJob extends HasInner<com.microsoft.azure.management.batchai.implementation.JobInner>, Indexable, HasId, HasName, Refreshable<BatchAIJob>
Client-side representation of Batch AI Job object, associated with Batch AI Cluster.
  • Method Details

    • terminate

      void terminate()
      Terminates a job.
    • terminateAsync

      rx.Completable terminateAsync()
      Terminates a job.
      Returns:
      a representation of the deferred computation of this call
    • listFiles

      com.microsoft.azure.PagedList<OutputFile> listFiles(String outputDirectoryId)
      Lists all files inside the given output directory (only if the output directory is on an Azure File Share or Azure Storage container).
      Parameters:
      outputDirectoryId - Id of the job output directory. This is the OutputDirectory-->id parameter that is given by the user during Create Job.
      Returns:
      list of files inside the given output directory
    • listFilesAsync

      rx.Observable<OutputFile> listFilesAsync(String outputDirectoryId)
      Lists all files inside the given output directory (only if the output directory is on an Azure File Share or Azure Storage container).
      Parameters:
      outputDirectoryId - Id of the job output directory. This is the OutputDirectory-->id parameter that is given by the user during Create Job.
      Returns:
      an observable that emits output file information
    • listFiles

      com.microsoft.azure.PagedList<OutputFile> listFiles(String outputDirectoryId, String directory, Integer linkExpiryMinutes, Integer maxResults)
      Lists all files inside the given output directory (only if the output directory is on an Azure File Share or Azure Storage container).
      Parameters:
      outputDirectoryId - Id of the job output directory. This is the OutputDirectory-->id parameter that is given by the user during Create Job.
      directory - the path to the directory
      linkExpiryMinutes - the number of minutes after which the download link will expire
      maxResults - the maximum number of items to return in the response. A maximum of 1000 files can be returned
      Returns:
      list of files inside the given output directory
    • listFilesAsync

      rx.Observable<OutputFile> listFilesAsync(String outputDirectoryId, String directory, Integer linkExpiryMinutes, Integer maxResults)
      Lists all files inside the given output directory (only if the output directory is on an Azure File Share or Azure Storage container).
      Parameters:
      outputDirectoryId - Id of the job output directory. This is the OutputDirectory-->id parameter that is given by the user during Create Job.
      directory - the path to the directory
      linkExpiryMinutes - the number of minutes after which the download link will expire
      maxResults - the maximum number of items to return in the response. A maximum of 1000 files can be returned
      Returns:
      an observable that emits output file information
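
The maxResults description above caps a response at 1000 files, so a caller may want to bound the value before passing it to the four-argument listFiles or listFilesAsync. The following is a minimal sketch under that assumption; ListFilesArgs and clampMaxResults are hypothetical names, not part of the SDK, and treating values below 1 as 1 is a choice made for this example only.

```java
// Hypothetical helper, not part of the Azure SDK: bounds the maxResults
// argument for listFiles(outputDirectoryId, directory, linkExpiryMinutes, maxResults).
final class ListFilesArgs {
    // Upper bound stated in the maxResults parameter description.
    static final int MAX_RESULTS_LIMIT = 1000;

    private ListFilesArgs() { }

    // Clamps the requested count into [1, 1000].
    // Mapping values below 1 to 1 is an assumption of this sketch.
    static int clampMaxResults(int requested) {
        return Math.max(1, Math.min(requested, MAX_RESULTS_LIMIT));
    }
}
```

A call could then look like job.listFiles(outputDirectoryId, directory, 60, ListFilesArgs.clampMaxResults(5000)), where the 60-minute link expiry is only an illustrative value.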
    • listRemoteLoginInformation

      com.microsoft.azure.PagedList<RemoteLoginInformation> listRemoteLoginInformation()
      Gets a list of currently existing nodes which were used for the Job execution. The returned information contains the node ID, its public IP and SSH port.
      Returns:
      list of remote login details
    • listRemoteLoginInformationAsync

      rx.Observable<RemoteLoginInformation> listRemoteLoginInformationAsync()
      Gets a list of currently existing nodes which were used for the Job execution. The returned information contains the node ID, its public IP and SSH port.
      Returns:
      an observable that emits remote login information
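
Because the returned information carries a node's public IP and SSH port, one plausible use is to build an ssh command line for each node. This is a sketch only: SshCommands is a hypothetical helper, the admin user name must come from the cluster configuration (it is not part of the remote login information described here), and the accessor names on RemoteLoginInformation are not shown in this section, so the helper takes plain values.

```java
// Hypothetical helper, not part of the Azure SDK.
final class SshCommands {
    private SshCommands() { }

    // Builds an ssh command line from a node's public IP and SSH port,
    // plus a caller-supplied user name (an assumption of this sketch).
    static String sshCommand(String adminUser, String publicIp, int sshPort) {
        return "ssh -p " + sshPort + " " + adminUser + "@" + publicIp;
    }
}
```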
    • schedulingPriority

      JobPriority schedulingPriority()
      Returns:
      priority associated with the job. Priority values can range from -1000 to 1000, with -1000 being the lowest priority and 1000 being the highest priority. The default value is 0.
    • cluster

      ResourceId cluster()
      Returns:
      the Id of the cluster on which this job will run.
    • mountVolumes

      MountVolumes mountVolumes()
      Returns:
      information on mount volumes to be used by the job. These volumes will be mounted before the job execution and will be unmounted after the job completion. The volumes will be mounted at the location specified by the $AZ_BATCHAI_JOB_MOUNT_ROOT environment variable.
    • jobOutputDirectoryPathSegment

      String jobOutputDirectoryPathSegment()
      Returns:
      a segment of the job's output directories path created by Batch AI. Batch AI creates the job's output directories under a unique path to avoid conflicts between jobs. This value contains a path segment generated by Batch AI to make the path unique and can be used to find the output directory on the node or mounted filesystem.
    • nodeCount

      int nodeCount()
      Returns:
      number of compute nodes to run the job on. The job will be gang scheduled on that many compute nodes.
    • containerSettings

      ContainerSettings containerSettings()
      Returns:
      the settings for the container to run the job. If not provided, the job will run on the VM.
    • toolType

      ToolType toolType()
      Returns:
      The toolkit type of this job
    • cntkSettings

      CNTKsettings cntkSettings()
      Returns:
      the settings for a CNTK (Microsoft Cognitive Toolkit) job
    • pyTorchSettings

      PyTorchSettings pyTorchSettings()
      Returns:
      the settings for a PyTorch job
    • tensorFlowSettings

      TensorFlowSettings tensorFlowSettings()
      Returns:
      the settings for a TensorFlow job
    • caffeSettings

      CaffeSettings caffeSettings()
      Returns:
      the settings for Caffe job.
    • chainerSettings

      ChainerSettings chainerSettings()
      Returns:
      the settings for Chainer job.
    • customToolkitSettings

      CustomToolkitSettings customToolkitSettings()
      Returns:
      the settings for a custom toolkit job
    • jobPreparation

      JobPreparation jobPreparation()
      Returns:
      the actions to be performed before the toolkit is launched. The specified actions will run on all the nodes that are part of the job.
    • stdOutErrPathPrefix

      String stdOutErrPathPrefix()
      Returns:
      the path where the Batch AI service will upload the stdout and stderr of the job.
    • inputDirectories

      List<InputDirectory> inputDirectories()
      Returns:
      the list of input directories for the Job
    • outputDirectories

      List<OutputDirectory> outputDirectories()
      Returns:
      the list of output directories where the models will be created
    • environmentVariables

      List<EnvironmentVariable> environmentVariables()
      Returns:
      additional environment variables to be passed to the job. The Batch AI service sets the following environment variables for all jobs: AZ_BATCHAI_INPUT_id, AZ_BATCHAI_OUTPUT_id, AZ_BATCHAI_NUM_GPUS_PER_NODE. For distributed TensorFlow jobs, the Batch AI service sets the following additional environment variables: AZ_BATCHAI_PS_HOSTS, AZ_BATCHAI_WORKER_HOSTS.
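
Code running inside a job typically needs to derive these variable names and read their values. The sketch below shows one way this might be done. BatchAiEnv is a hypothetical helper, not part of the SDK. It assumes that the directory id is appended verbatim to the AZ_BATCHAI_INPUT_ and AZ_BATCHAI_OUTPUT_ prefixes, and that the host variables hold comma-separated entries; neither detail is stated in this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Azure SDK.
final class BatchAiEnv {
    private BatchAiEnv() { }

    // Name of the variable for an input directory.
    // Assumption: the id is appended verbatim to the prefix.
    static String inputVariableName(String inputDirectoryId) {
        return "AZ_BATCHAI_INPUT_" + inputDirectoryId;
    }

    // Name of the variable for an output directory (same assumption).
    static String outputVariableName(String outputDirectoryId) {
        return "AZ_BATCHAI_OUTPUT_" + outputDirectoryId;
    }

    // Splits a host-list variable such as AZ_BATCHAI_WORKER_HOSTS.
    // Assumption: entries are comma-separated. Returns an empty list if unset.
    static List<String> hosts(Map<String, String> env, String variableName) {
        List<String> result = new ArrayList<>();
        String raw = env.get(variableName);
        if (raw == null) {
            return result;
        }
        for (String part : raw.split(",")) {
            String host = part.trim();
            if (!host.isEmpty()) {
                result.add(host);
            }
        }
        return result;
    }
}
```

Inside a running job, the map would normally be System.getenv(); passing it as a parameter keeps the helper easy to exercise with sample values.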
    • secrets

      Returns:
      environment variables with secret values to set on the job. Only the names are reported; the server never returns the values of these variables.
    • constraints

      Returns:
      constraints associated with the Job.
    • creationTime

      org.joda.time.DateTime creationTime()
      Returns:
      the creation time of the job
    • provisioningState

      ProvisioningState provisioningState()
      Returns:
      the provisioned state of the Batch AI job
    • provisioningStateTransitionTime

      org.joda.time.DateTime provisioningStateTransitionTime()
      The time at which the job entered its current provisioning state.
      Returns:
      the time at which the job entered its current provisioning state
    • executionState

      ExecutionState executionState()
      Gets the current state of the job. Possible values are:
      queued - The job is queued and able to run. A job enters this state when it is created, or when it is awaiting a retry after a failed run.
      running - The job is running on a compute cluster. This includes job-level preparation such as downloading resource files or setting up the container specified on the job; it does not necessarily mean that the job command line has started executing.
      terminating - The job has been terminated by the user and the terminate operation is in progress.
      succeeded - The job has completed running successfully and exited with exit code 0.
      failed - The job has finished unsuccessfully (failed with a non-zero exit code) and has exhausted its retry limit. A job is also marked as failed if an error occurred launching the job.
      Returns:
      the current state of the job
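
A common use of the execution state is to poll a job until it reaches succeeded or failed. The sketch below illustrates that loop with stand-in types so that it compiles without the SDK. JobPolling, State and JobView are hypothetical: State mirrors the five documented values, and JobView stands in for the refresh() and executionState() operations of BatchAIJob (the real refresh() comes from Refreshable<BatchAIJob> and returns the refreshed job rather than void).

```java
// Hypothetical sketch, not part of the Azure SDK.
final class JobPolling {
    // Stand-in for ExecutionState, mirroring the five documented values.
    enum State { QUEUED, RUNNING, TERMINATING, SUCCEEDED, FAILED }

    // Stand-in for the two BatchAIJob operations used by the loop.
    interface JobView {
        void refresh();
        State executionState();
    }

    private JobPolling() { }

    // succeeded and failed are the states the description marks as finished.
    static boolean isTerminal(State state) {
        return state == State.SUCCEEDED || state == State.FAILED;
    }

    // Refreshes the job every pollMillis until it finishes or maxPolls is
    // reached, then returns the last observed state.
    static State awaitTerminal(JobView job, long pollMillis, int maxPolls) {
        State state = job.executionState();
        for (int i = 0; i < maxPolls && !isTerminal(state); i++) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop polling early.
                Thread.currentThread().interrupt();
                return state;
            }
            job.refresh();
            state = job.executionState();
        }
        return state;
    }
}
```

The maxPolls bound keeps the loop from running forever if a job stays queued; callers should check whether the returned state is actually terminal.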
    • executionStateTransitionTime

      org.joda.time.DateTime executionStateTransitionTime()
      Returns:
      the time at which the job entered its current execution state
    • executionInfo

      Returns:
      information about the execution of a job in the Azure Batch service.
    • experiment

      BatchAIExperiment experiment()
      Returns:
      the experiment information of the job.