composer.algorithms#

Modules

`composer.algorithms.algorithm_hparams_registry`	composer.algorithms.algorithm_hparams_registry
`composer.algorithms.alibi`	ALiBi (Attention with Linear Biases; Press et al, 2021) dispenses with position embeddings for tokens in transformer-based NLP models, instead encoding position information by biasing the query-key attention scores proportionally to each token pair's distance.
`composer.algorithms.augmix`	AugMix (Hendrycks et al, 2020) creates multiple independent realizations of sequences of image augmentations, applies each sequence with random intensity, and returns a convex combination of the augmented images and the original image.
`composer.algorithms.blurpool`	BlurPool adds anti-aliasing filters to convolutional layers to increase accuracy and invariance to small shifts in the input.
`composer.algorithms.channels_last`	Changes the memory format of the model to `torch.channels_last`.
`composer.algorithms.colout`	Drops a fraction of the rows and columns of an input image.
`composer.algorithms.cutmix`	CutMix trains the network on non-overlapping combinations of pairs of examples and iterpolated targets rather than individual examples and targets.
`composer.algorithms.cutout`	Cutout is a data augmentation technique that works by masking out one or more square regions of an input image.
`composer.algorithms.ema`	Exponential moving average maintains a moving average of model parameters and uses these at test time.
`composer.algorithms.factorize`	Decomposes linear operators into pairs of smaller linear operators.
`composer.algorithms.fused_layernorm`	Replaces all instances of torch.nn.LayerNorm with a apex.normalization.fused_layer_norm.FusedLayerNorm.
`composer.algorithms.gated_linear_units`	Replaces the Linear layers in the feed-forward network with Gated Linear Units.
`composer.algorithms.ghost_batchnorm`	Replaces batch normalization modules with Ghost Batch Normalization modules that simulate the effect of using a smaller batch size.
`composer.algorithms.gradient_clipping`	Clips all gradients in a model based on their values, their norms, and their parameters' norms.
`composer.algorithms.label_smoothing`	Shrinks targets towards a uniform distribution to counteract label noise.
`composer.algorithms.layer_freezing`	Progressively freeze the layers of the network during training, starting with the earlier layers.
`composer.algorithms.mixup`	Create new samples using convex combinations of pairs of samples.
`composer.algorithms.no_op_model`	Replaces model with a dummy model of type `NoOpModelClass`.
`composer.algorithms.progressive_resizing`	Applies Fastai's progressive resizing data augmentation to speed up training.
`composer.algorithms.randaugment`	Randomly applies a sequence of image data augmentations (Cubuk et al, 2019) to an image.
`composer.algorithms.sam`	SAM (Foret et al, 2020) wraps an existing optimizer with a `SAMOptimizer` which makes the optimizer minimize both loss value and sharpness.This can improves model generalization and provide robustness to label noise.
`composer.algorithms.selective_backprop`	Selective Backprop prunes minibatches according to the difficulty of the individual training examples, and only computes weight gradients over the pruned subset, reducing iteration time and speeding up training.
`composer.algorithms.seq_length_warmup`	Sequence length warmup progressively increases the sequence length during training of NLP models.
`composer.algorithms.squeeze_excite`	Adds Squeeze-and-Excitation blocks (Hu et al, 2019) after the `Conv2d` modules in a neural network.
`composer.algorithms.stochastic_depth`	Implements stochastic depth (Huang et al, 2016) for ResNet blocks.
`composer.algorithms.swa`	Stochastic Weight Averaging (SWA; Izmailov et al, 2018) averages model weights sampled at different times near the end of training.
`composer.algorithms.utils`	Helper utilities for algorithms.
`composer.algorithms.warnings`	composer.algorithms.warnings

Efficiency methods for training.

Examples include LabelSmoothing and adding SqueezeExcite blocks, among many others.

Algorithms are implemented in both a standalone functional form (see composer.functional) and as subclasses of Algorithm for integration in the Composer Trainer. The former are easier to integrate piecemeal into an existing codebase. The latter are easier to compose together, since they all have the same public interface and work automatically with the Composer Trainer.

For ease of composability, algorithms in our Trainer are based on the two-way callbacks concept from Howard et al, 2020. Each algorithm implements two methods:

Algorithm.match(): returns True if the algorithm should be run given the current State and Event.
Algorithm.apply(): performs an in-place modification of the given State

For example, a simple algorithm that shortens training:

from composer import Algorithm, State, Event, Logger

class ShortenTraining(Algorithm):

    def match(self, state: State, event: Event, logger: Logger) -> bool:
        return event == Event.INIT

    def apply(self, state: State, event: Event, logger: Logger):
        state.max_duration /= 2  # cut training time in half

For more information about events, see Event.

Classes

`Alibi`	ALiBi (Attention with Linear Biases; Press et al, 2021) dispenses with position embeddings and instead directly biases attention matrices such that nearby tokens attend to one another more strongly.
`AugMix`	The AugMix data augmentation technique.
`AugmentAndMixTransform`	Wrapper module for `augmix_image()` that can be passed to `torchvision.transforms.Compose`.
`BlurPool`	BlurPool adds anti-aliasing filters to convolutional layers.
`ChannelsLast`	Changes the memory format of the model to torch.channels_last.
`ColOut`	Drops a fraction of the rows and columns of an input image and (optionally) a target image.
`ColOutTransform`	Torchvision-like transform for performing the ColOut augmentation, where random rows and columns are dropped from up to two Torch tensors or two PIL images.
`CutMix`	CutMix trains the network on non-overlapping combinations of pairs of examples and interpolated targets rather than individual examples and targets.
`CutOut`	CutOut is a data augmentation technique that works by masking out one or more square regions of an input image.
`EMA`	Maintains a shadow model with weights that follow the exponential moving average of the trained model weights.
`Factorize`	Decomposes linear operators into pairs of smaller linear operators.
`FusedLayerNorm`	Replaces all instances of torch.nn.LayerNorm with a apex.normalization.fused_layer_norm.FusedLayerNorm.
`GatedLinearUnits`	Replaces all instances of Linear layers in the feed-forward subnetwork with a Gated Linear Unit.
`GhostBatchNorm`	Replaces batch normalization modules with Ghost Batch Normalization modules that simulate the effect of using a smaller batch size.
`GradientClipping`	Clips all gradients in model based on specified clipping_type.
`LabelSmoothing`	Shrink targets towards a uniform distribution as in Szegedy et al.
`LayerFreezing`	Progressively freeze the layers of the network during training, starting with the earlier layers.
`MixUp`	MixUp trains the network on convex batch combinations.
`NoOpModel`	Runs on `Event.INIT` and replaces the model with a dummy `NoOpModelClass` instance.
`ProgressiveResizing`	Resize inputs and optionally outputs by cropping or interpolating.
`RandAugment`	Randomly applies a sequence of image data augmentations to an image.
`RandAugmentTransform`	Wraps `randaugment_image()` in a `torchvision`-compatible transform.
`SAM`	Adds sharpness-aware minimization (Foret et al, 2020) by wrapping an existing optimizer with a `SAMOptimizer`.
`SWA`	Applies Stochastic Weight Averaging (Izmailov et al, 2018).
`SelectiveBackprop`	Selectively backpropagate gradients from a subset of each batch.
`SeqLengthWarmup`	Progressively increases the sequence length during training.
`SqueezeExcite`	Adds Squeeze-and-Excitation blocks (Hu et al, 2019) after the `torch.nn.Conv2d` modules in a neural network.
`SqueezeExcite2d`	Squeeze-and-Excitation block from (Hu et al, 2019)
`SqueezeExciteConv2d`	Helper class used to add a `SqueezeExcite2d` module after a `torch.nn.Conv2d`.
`StochasticDepth`	Applies Stochastic Depth (Huang et al, 2016) to the specified model.