`R/hyperdrive.R`

`bandit_policy.Rd`

Bandit is an early termination policy based on slack factor/slack amount and evaluation interval. The policy terminates early any run whose primary metric is not within the specified slack factor/slack amount of the best performing training run.

    bandit_policy(
      slack_factor = NULL,
      slack_amount = NULL,
      evaluation_interval = 1L,
      delay_evaluation = 0L
    )

Argument | Description |
---|---|
slack_factor | A double of the ratio of the allowed distance from the best performing run. |
slack_amount | A double of the absolute distance allowed from the best performing run. |
evaluation_interval | An integer of the frequency for applying policy. |
delay_evaluation | An integer of the number of intervals for which to delay the first evaluation. |

The `BanditPolicy` object.

The Bandit policy takes the following configuration parameters:

- `slack_factor` or `slack_amount`: The slack allowed with respect to the best performing training run. `slack_factor` specifies the allowable slack as a ratio. `slack_amount` specifies the allowable slack as an absolute amount, instead of a ratio.
- `evaluation_interval`: Optional. The frequency for applying the policy. Each time the training script logs the primary metric counts as one interval.
- `delay_evaluation`: Optional. The number of intervals to delay the policy evaluation. Use this parameter to avoid premature termination of training runs. If specified, the policy applies at every multiple of `evaluation_interval` that is greater than or equal to `delay_evaluation`.
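The interaction between `evaluation_interval` and `delay_evaluation` can be sketched in a few lines of base R. The helper `policy_applies_at()` is hypothetical and is not part of the package; it only restates the rule above (multiples of `evaluation_interval` that are at least `delay_evaluation`).

```r
# Hypothetical helper, not part of azuremlsdk: returns TRUE for the intervals
# at which the policy would be evaluated under the rule described above.
policy_applies_at <- function(interval,
                              evaluation_interval = 1L,
                              delay_evaluation = 0L) {
  interval %% evaluation_interval == 0L & interval >= delay_evaluation
}

# Candidate intervals 100, 150, ..., 500.
intervals <- seq(100L, 500L, by = 50L)

# Keep only the intervals at which the policy would be applied.
applied <- intervals[policy_applies_at(intervals,
                                       evaluation_interval = 100L,
                                       delay_evaluation = 200L)]
print(applied)
```

With `evaluation_interval = 100L` and `delay_evaluation = 200L`, interval 100 is skipped because it falls before the delay, and interval 150 is skipped because it is not a multiple of 100.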

Any run that doesn't fall within the slack factor or slack amount of the evaluation metric with respect to the best performing run will be terminated.

Consider a Bandit policy with `slack_factor = 0.2` and `evaluation_interval = 100`. Assume that run X is currently the best performing run, with an AUC (performance metric) of 0.8 after 100 intervals. Further, assume the best AUC reported for another run is Y. The policy compares the value `(Y + Y * 0.2)` to 0.8 and, if it is smaller, cancels that run. If `delay_evaluation = 200`, then the policy is first applied at interval 200.
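The slack-factor comparison can be checked with a small base R sketch. The function `exceeds_slack_factor()` and the candidate AUC values are made up for illustration, and the sketch assumes a metric that is being maximized, as AUC is here.

```r
# Hypothetical helper, not part of azuremlsdk: mirrors the comparison
# (Y + Y * slack_factor) < best described above, for a maximized metric.
exceeds_slack_factor <- function(metric, best_metric, slack_factor) {
  metric + metric * slack_factor < best_metric
}

best_auc <- 0.8
candidate_auc <- c(0.60, 0.66, 0.67, 0.75)  # illustrative values only

# TRUE marks a run that would be cancelled.
cancelled <- exceeds_slack_factor(candidate_auc, best_auc, slack_factor = 0.2)
print(cancelled)

# Rearranging the comparison gives an equivalent cut-off on Y:
# a run is cancelled when Y < best / (1 + slack_factor).
cutoff <- best_auc / (1 + 0.2)
print(cutoff)
```

Under these assumptions the cut-off is roughly 0.667, so the runs at 0.60 and 0.66 would be cancelled while those at 0.67 and 0.75 would continue.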

Now, consider a Bandit policy with `slack_amount = 0.2` and `evaluation_interval = 100`. If run 3 is currently the best performing run, with an AUC (performance metric) of 0.8 after 100 intervals, then any run with an AUC less than 0.6 (`0.8 - 0.2`) after 100 intervals will be terminated. Similarly, `delay_evaluation` can be used to delay the first termination policy evaluation for a specific number of intervals.
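The slack-amount rule is an absolute threshold rather than a ratio, which the following base R sketch illustrates. As before, `outside_slack_amount()` and the candidate values are hypothetical, and a maximized metric is assumed.

```r
# Hypothetical helper, not part of azuremlsdk: a run is terminated when its
# metric falls below (best - slack_amount), assuming a maximized metric.
outside_slack_amount <- function(metric, best_metric, slack_amount) {
  metric < best_metric - slack_amount
}

best_auc <- 0.8
candidate_auc <- c(0.45, 0.55, 0.65, 0.79)  # illustrative values only

# TRUE marks a run that would be terminated.
terminated <- outside_slack_amount(candidate_auc, best_auc, slack_amount = 0.2)
print(terminated)
```

The candidate values deliberately avoid the exact boundary of 0.6, since `0.8 - 0.2` is not represented exactly in floating point and a comparison at the boundary could go either way.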

    # In this example, the early termination policy is applied at every interval
    # when metrics are reported, starting at evaluation interval 5. Any run whose
    # best metric is less than (1 / (1 + 0.1)) or 91% of the best performing run will
    # be terminated
    if (FALSE) {
    early_termination_policy = bandit_policy(slack_factor = 0.1,
                                             evaluation_interval = 1L,
                                             delay_evaluation = 5L)
    }