ƒ() Functional

The simplest way to use Composer’s algorithms is via the functional API. These algorithms can be grouped into three broad classes:

  • data augmentations add additional transforms to the training data.

  • model surgery algorithms modify the network architecture.

  • training loop modifications change the logic in the training loop.

Data augmentations can be inserted either into the dataloader as a transform or after a batch has been loaded, depending on what the augmentation acts on. Here is an example of using 🎲 RandAugment with the functional API.

import torch
from torchvision import datasets, transforms

from composer import functional as cf

# Standard per-channel statistics for CIFAR-10
mean = (0.4914, 0.4822, 0.4465)
std = (0.2470, 0.2435, 0.2616)

c10_transforms = transforms.Compose([cf.randaugment_image,  # <---- Add RandAugment
                                     transforms.ToTensor(),
                                     transforms.Normalize(mean, std)])

dataset = datasets.CIFAR10('../data',
                           train=True,
                           download=True,
                           transform=c10_transforms)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=1024)

Other data augmentations, such as CutMix, act on a batch of inputs. These can be inserted in the training loop after a batch is loaded from the dataloader as follows:

from composer import functional as cf

cutmix_alpha = 1
num_classes = 10

# Assumes model, optimizer, loss_fn, and dataloader are already defined
for batch_idx, (data, target) in enumerate(dataloader):
    data, target = cf.cutmix_batch(  # <-- insert CutMix
        data,
        target,
        alpha=cutmix_alpha,
        num_classes=num_classes
    )
    optimizer.zero_grad()
    output = model(data)
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()

Model surgery algorithms make direct modifications to the network itself. Functionally, these can be called as follows, using BlurPool as an example:

import torchvision.models as models

from composer import functional as cf

model = models.resnet18()
cf.apply_blurpool(model)

Each method card has a section describing how to use these methods in your own training loop. For example, the functional API can be used to apply several model surgery algorithms to the same model:

from composer import functional as cf
from torchvision import models

model = models.resnet50()

# replace some layers with blurpool
cf.apply_blurpool(model)
# replace some layers with squeeze-excite
cf.apply_squeeze_excite(model, latent_channels=64, min_channels=128)

Functions

apply_agc

Clips all gradients in model based on ratio of gradient norms to parameter norms.
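
For example, a minimal sketch of where this fits in a training step, assuming model, loss, and optimizer are already defined and cf is imported as above; the clipping threshold is left at its library default:

loss.backward()
cf.apply_agc(model)  # clip gradients whose norms are large relative to their parameter norms
optimizer.step()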

apply_alibi

Removes position embeddings and replaces the attention function and attention mask as per ALiBi.

apply_blurpool

Add anti-aliasing filters to the strided torch.nn.Conv2d and/or torch.nn.MaxPool2d modules within model.

apply_channels_last

Changes the memory format of the model to torch.channels_last.
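
A minimal sketch; the input conversion shown is plain PyTorch and is included only to illustrate how the model would typically be fed:

import torch
import torchvision.models as models

from composer import functional as cf

model = models.resnet50()
cf.apply_channels_last(model)  # weights now stored in channels-last memory format

x = torch.randn(8, 3, 224, 224).to(memory_format=torch.channels_last)
out = model(x)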

apply_factorization

Replaces Linear and Conv2d modules with FactorizedLinear and FactorizedConv2d modules.
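
A minimal sketch that applies the default factorization settings to a torchvision model; which layers are actually factorized depends on the library's default size thresholds:

import torchvision.models as models

from composer import functional as cf

model = models.resnet50()
cf.apply_factorization(model)  # swap eligible Conv2d/Linear modules for factorized versions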

apply_ghost_batchnorm

Replace batch normalization modules with ghost batch normalization modules.
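
A minimal sketch; ghost_batch_size sets how many samples each set of normalization statistics is computed over, and the value 32 below is illustrative rather than a recommendation:

import torchvision.models as models

from composer import functional as cf

model = models.resnet50()
cf.apply_ghost_batchnorm(model, ghost_batch_size=32)  # BatchNorm now normalizes over chunks of 32 samples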

apply_squeeze_excite

Adds Squeeze-and-Excitation blocks (Hu et al., 2019) after Conv2d layers.

apply_stochastic_depth

Applies Stochastic Depth (Huang et al., 2016) to the specified model.
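
A minimal sketch for a torchvision ResNet; target_layer_name selects which block type is made stochastic, and 'ResNetBottleneck' is assumed here to be the name Composer uses for torchvision bottleneck blocks:

import torchvision.models as models

from composer import functional as cf

model = models.resnet50()
cf.apply_stochastic_depth(model, target_layer_name='ResNetBottleneck')  # bottleneck blocks may be skipped during training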

augmix_image

Applies AugMix (Hendrycks et al., 2020) data augmentation to a single image or batch of images.
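
A per-image sketch using the default AugMix settings; the file path is a placeholder:

from PIL import Image

from composer import functional as cf

img = Image.open('example.jpg')   # placeholder path to any image
augmented = cf.augmix_image(img)  # returns an AugMix-augmented copy of the image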

colout_batch

Applies ColOut augmentation to a batch of images and (optionally) targets, dropping the same random rows and columns from all images and targets in a batch.
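
A minimal sketch on a random batch of image tensors; p_row and p_col set the fraction of rows and columns to drop:

import torch

from composer import functional as cf

images = torch.randn(16, 3, 32, 32)                        # NCHW batch
smaller = cf.colout_batch(images, p_row=0.15, p_col=0.15)  # drops ~15% of rows and columns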

compute_ema

Updates the weights of ema_model to be closer to the weights of model according to an exponential weighted average.
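
A sketch of maintaining an EMA copy of the weights after each optimizer step, assuming model, optimizer, loss_fn, and dataloader are already defined and cf is imported as above; the smoothing value is illustrative:

import copy

ema_model = copy.deepcopy(model)  # running exponential average of the weights

for data, target in dataloader:
    optimizer.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()
    optimizer.step()
    cf.compute_ema(model, ema_model, smoothing=0.99)  # pull ema_model toward the current weights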

cutmix_batch

Create new samples using combinations of pairs of samples.

cutout_batch

Applies the CutOut data augmentation to a batch of images; see CutOut.

freeze_layers

Progressively freeze the layers of the network in-place during training, starting with the earlier layers.

mixup_batch

Create new samples using convex combinations of pairs of samples.

randaugment_image

Randomly applies a sequence of image data augmentations (Cubuk et al., 2019) to an image or batch of images.
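
A per-image sketch outside of a transform pipeline; severity and depth control the intensity and the number of augmentations applied, and the values below are illustrative:

from PIL import Image

from composer import functional as cf

img = Image.open('example.jpg')  # placeholder path to any image
augmented = cf.randaugment_image(img, severity=9, depth=2)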

resize_batch

Resize inputs and optionally outputs by cropping or interpolating.

select_using_loss

Prunes minibatches as a subroutine of SelectiveBackprop.

set_batch_sequence_length

Set the sequence length of a batch.

should_selective_backprop

Decides if selective backprop should be run based on time in training.

smooth_labels

Shrink targets towards a uniform distribution as in Szegedy et al.
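
A sketch of producing smoothed targets inside the training loop, with cf imported as above; the smoothing value 0.1 is illustrative, and the loss function paired with it is assumed to accept dense, soft targets:

output = model(data)
smoothed_targets = cf.smooth_labels(output, target, 0.1)  # pull targets toward a uniform distribution
loss = loss_fn(output, smoothed_targets)                  # loss_fn assumed to support soft targets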