composer.algorithms.stochastic_depth.stochastic_depth

Functions

apply_stochastic_depth

Applies Stochastic Depth (Huang et al., 2016) to the specified model.

Classes

Algorithm

Base class for algorithms.

Bottleneck

torchvision.models.resnet.Bottleneck

Event

Enum to represent events in the training loop.

Logger

Logger routes metrics to the LoggerCallback.

SampleStochasticBottleneck

Sample-wise stochastic ResNet Bottleneck block.

State

The state of the trainer.

StochasticBottleneck

Stochastic ResNet Bottleneck block.

StochasticDepth

Applies Stochastic Depth (Huang et al., 2016) to the specified model.

Time

Time represents static durations of training time or points in the training process in terms of a TimeUnit enum (epochs, batches, samples, tokens, or duration).

TimeUnit

Enum class to represent units of time for the training process.

Attributes

  • Optimizers

  • Optional

  • Type

  • Union

  • annotations

  • log

class composer.algorithms.stochastic_depth.stochastic_depth.StochasticDepth(target_layer_name, stochastic_method='block', drop_rate=0.2, drop_distribution='linear', drop_warmup=0.0, use_same_gpu_seed=True)

Bases: composer.core.algorithm.Algorithm

Applies Stochastic Depth (Huang et al., 2016) to the specified model.

The algorithm replaces the specified target layer with a stochastic version of the layer. Depending on the stochastic method specified, the stochastic layer randomly drops either the layer itself ("block") or individual samples within it ("sample"). The block-wise version follows the original paper. The sample-wise version follows the implementation used for EfficientNet in the TensorFlow/TPU repo.
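The difference between the two stochastic methods can be illustrated with a minimal pure-Python sketch. It is not Composer's implementation: residual_branch, block_wise, and sample_wise are hypothetical names, and the batch is a plain list of lists rather than a tensor. The sketch assumes that the block-wise variant scales the residual branch by the survival probability at evaluation time, as in the original paper, and that the sample-wise variant rescales surviving samples by 1 / (1 - drop_rate) during training, as in EfficientNet-style drop-connect.

```python
import random


def residual_branch(x):
    # Hypothetical stand-in for the convolutional layers inside a block.
    return [2.0 * v for v in x]


def block_wise(batch, drop_rate, training, rng):
    """Drop the residual branch for the whole batch with probability drop_rate."""
    if training:
        if rng.random() < drop_rate:
            # Block dropped: every sample passes through the identity path only.
            return [list(x) for x in batch]
        return [[a + b for a, b in zip(x, residual_branch(x))] for x in batch]
    # Evaluation: keep the branch, scaled by its survival probability.
    keep = 1.0 - drop_rate
    return [[a + keep * b for a, b in zip(x, residual_branch(x))] for x in batch]


def sample_wise(batch, drop_rate, training, rng):
    """Drop the residual branch independently for each sample in the batch."""
    keep = 1.0 - drop_rate
    out = []
    for x in batch:
        branch = residual_branch(x)
        if training:
            kept = keep > 0.0 and rng.random() < keep
            # Surviving samples are rescaled so the expected output is unchanged.
            branch = [b / keep if kept else 0.0 for b in branch]
        out.append([a + b for a, b in zip(x, branch)])
    return out


rng = random.Random(0)
batch = [[1.0, 2.0], [3.0, 4.0]]
print(block_wise(batch, drop_rate=0.5, training=True, rng=rng))
print(sample_wise(batch, drop_rate=0.5, training=True, rng=rng))
```

In the block-wise case a single random draw decides the fate of the entire batch, whereas the sample-wise case draws once per sample, so some samples in a batch can skip the block while others do not.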

Runs on INIT, as well as BATCH_START if drop_warmup > 0.

Note

Stochastic Depth only works on instances of torchvision.models.resnet.ResNet for now.

Parameters
  • target_layer_name (str) – Block to replace with a stochastic block equivalent. The name must be registered in the STOCHASTIC_LAYER_MAPPING dictionary with the target layer class and the stochastic layer class. Currently, only torchvision.models.resnet.Bottleneck is supported.

  • stochastic_method (str, optional) – The version of stochastic depth to use. "block" randomly drops blocks during training. "sample" randomly drops samples within a block during training. Default: "block".

  • drop_rate (float, optional) – The base probability of dropping a layer or sample. Must be between 0.0 and 1.0. Default: 0.2.

  • drop_distribution (str, optional) – How drop_rate is distributed across layers. Value must be one of "uniform" or "linear". "uniform" assigns the same drop_rate across all layers. "linear" linearly increases the drop rate across layer depth starting with 0 drop rate and ending with drop_rate. Default: "linear".

  • drop_warmup (str | Time | float, optional) – A Time object, time-string, or float on [0.0, 1.0] representing the fraction of the training duration to linearly increase the drop probability to linear_drop_rate. Default: 0.0.

  • use_same_gpu_seed (bool, optional) – Set to True to have the same layers dropped across GPUs when using multi-GPU training. Set to False to have each GPU drop a different set of layers. Only used with "block" stochastic method. Default: True.
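To make the drop_distribution options concrete, the sketch below computes a per-layer drop rate for a stack of blocks. The function layer_drop_rates is hypothetical and is not part of Composer. It assumes that "linear" interpolates from 0 at the first layer to drop_rate at the last, as the parameter description states. The library's exact indexing of layers may differ.

```python
def layer_drop_rates(num_layers, drop_rate=0.2, drop_distribution="linear"):
    """Return one drop probability per layer, ordered from shallowest to deepest."""
    if not 0.0 <= drop_rate <= 1.0:
        raise ValueError("drop_rate must be between 0.0 and 1.0")
    if drop_distribution == "uniform":
        # Every layer shares the same drop probability.
        return [drop_rate] * num_layers
    if drop_distribution == "linear":
        if num_layers == 1:
            return [drop_rate]
        # Assumed interpolation: 0 at the first layer, drop_rate at the last.
        return [drop_rate * i / (num_layers - 1) for i in range(num_layers)]
    raise ValueError('drop_distribution must be "uniform" or "linear"')


# ResNet-50 has 16 Bottleneck blocks (3 + 4 + 6 + 3).
for index, rate in enumerate(layer_drop_rates(16, drop_rate=0.2)):
    print(f"block {index:2d}: drop probability {rate:.3f}")
```

With the "linear" setting the shallow blocks are almost always kept, so most of the regularization is applied to the deeper blocks.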

apply(event, state, logger)

Applies StochasticDepth modification to the state's model.

Parameters
  • event (Event) – the current event

  • state (State) – the current trainer state

  • logger (Logger) – the training logger

property find_unused_parameters

DDP parameter indicating that some parameters may not receive gradients, because a layer dropped during the forward pass does not contribute to the loss.

match(event, state)

Runs on INIT, as well as BATCH_START if drop_warmup > 0.

Parameters
  • event (Event) – The current event.

  • state (State) – The current state.

Returns

True if this algorithm should run now.

Return type

bool
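The matching rule and the warmup it enables can be sketched as follows. This is an illustration under stated assumptions rather than Composer's code: the Event enum here is a two-member stand-in, should_run and warmed_up_drop_rate are hypothetical helpers, and drop_warmup is treated as a plain float fraction of training (the real parameter also accepts a Time object or time-string). The linear ramp from 0 to the target rate follows the drop_warmup description.

```python
from enum import Enum


class Event(Enum):
    # Hypothetical stand-in for composer's Event enum; only two members shown.
    INIT = "init"
    BATCH_START = "batch_start"


def should_run(event, drop_warmup):
    """Documented rule: always on INIT, on BATCH_START only when warmup is enabled."""
    return event == Event.INIT or (event == Event.BATCH_START and drop_warmup > 0.0)


def warmed_up_drop_rate(target_rate, elapsed_fraction, drop_warmup):
    """Ramp linearly from 0 to target_rate over the first drop_warmup of training."""
    if drop_warmup <= 0.0:
        # No warmup: the target rate applies from the first batch.
        return target_rate
    progress = min(elapsed_fraction / drop_warmup, 1.0)
    return target_rate * progress


for elapsed in (0.0, 0.1, 0.25, 0.5, 1.0):
    rate = warmed_up_drop_rate(0.2, elapsed, drop_warmup=0.5)
    print(f"elapsed={elapsed:.2f} -> drop rate {rate:.3f}")
```

The drop rate only changes during warmup, so running on BATCH_START is needed only when drop_warmup > 0; otherwise the one-time model modification on INIT is sufficient.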

composer.algorithms.stochastic_depth.stochastic_depth.apply_stochastic_depth(model, target_layer_name, stochastic_method='block', drop_rate=0.2, drop_distribution='linear', use_same_gpu_seed=True, optimizers=None)

Applies Stochastic Depth (Huang et al., 2016) to the specified model.

The algorithm replaces the specified target layer with a stochastic version of the layer. Depending on the stochastic method specified, the stochastic layer randomly drops either the layer itself ("block") or individual samples within it ("sample"). The block-wise version follows the original paper. The sample-wise version follows the implementation used for EfficientNet in the TensorFlow/TPU repo.

Note

Stochastic Depth only works on instances of torchvision.models.resnet.ResNet for now.

Parameters
  • model (Module) – model containing modules to be replaced with stochastic versions

  • target_layer_name (str) – Block to replace with a stochastic block equivalent. The name must be registered in the STOCHASTIC_LAYER_MAPPING dictionary with the target layer class and the stochastic layer class. Currently, only torchvision.models.resnet.Bottleneck is supported.

  • stochastic_method (str, optional) – The version of stochastic depth to use. "block" randomly drops blocks during training. "sample" randomly drops samples within a block during training. Default: "block".

  • drop_rate (float, optional) – The base probability of dropping a layer or sample. Must be between 0.0 and 1.0. Default: 0.2.

  • drop_distribution (str, optional) – How drop_rate is distributed across layers. Value must be one of "uniform" or "linear". "uniform" assigns the same drop_rate across all layers. "linear" linearly increases the drop rate across layer depth starting with 0 drop rate and ending with drop_rate. Default: "linear".

  • use_same_gpu_seed (bool, optional) – Set to True to have the same layers dropped across GPUs when using multi-GPU training. Set to False to have each GPU drop a different set of layers. Only used with "block" stochastic method. Default: True.

  • optimizers (Optimizers, optional) –

    Existing optimizers bound to model.parameters(). All optimizers that have already been constructed with model.parameters() must be specified here so they will optimize the correct parameters.

    If the optimizer(s) are constructed after calling this function, then it is safe to omit this parameter. These optimizers will see the correct model parameters.

Returns

The modified model

Example

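import composer.functional as cf
from torchvision import models

# Build a torchvision ResNet-50, whose residual blocks are Bottleneck modules.
model = models.resnet50()

# Replace the blocks registered under 'ResNetBottleneck' with stochastic versions.
cf.apply_stochastic_depth(model, target_layer_name='ResNetBottleneck')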