composer.algorithms.cutmix.cutmix

Core CutMix classes and functions.

Functions

cutmix_batch

Create new samples using combinations of pairs of samples.

Classes

CutMix

CutMix trains the network on non-overlapping combinations of pairs of examples and interpolated targets rather than on individual examples and targets.

class composer.algorithms.cutmix.cutmix.CutMix(num_classes, alpha=1.0)

Bases: composer.core.algorithm.Algorithm

CutMix trains the network on non-overlapping combinations of pairs of examples and interpolated targets rather than on individual examples and targets.

This is done by taking a non-overlapping combination of a given batch X with a randomly permuted copy of X. The area is drawn from a Beta(alpha, alpha) distribution.

Training in this fashion sometimes reduces generalization error.

Example

from composer.algorithms import CutMix
from composer.trainer import Trainer
cutmix_algorithm = CutMix(num_classes=1000, alpha=1.0)
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration="1ep",
    algorithms=[cutmix_algorithm],
    optimizers=[optimizer]
)
Parameters
  • num_classes (int) – the number of classes in the task labels.

  • alpha (float) – the pseudocount for the Beta distribution used to sample area parameters. As alpha grows, the two samples in each pair tend to be weighted more equally. As alpha approaches 0 from above, the combination approaches using only one element of the pair.
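The effect of alpha described above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name mean_distance_from_half is hypothetical and is not part of Composer. It measures how far samples from Beta(alpha, alpha) fall from 0.5 on average, where 0.5 corresponds to weighting both samples of a pair equally.

```python
import random


def mean_distance_from_half(alpha, n=10_000, seed=0):
    """Average |lambda - 0.5| over n draws of lambda ~ Beta(alpha, alpha).

    Hypothetical helper for illustration; not a Composer function.
    """
    rng = random.Random(seed)
    total = sum(abs(rng.betavariate(alpha, alpha) - 0.5) for _ in range(n))
    return total / n


# Small alpha pushes lambda toward 0 or 1 (one sample dominates);
# large alpha concentrates lambda near 0.5 (samples weighted equally).
for alpha in (0.1, 1.0, 10.0):
    distance = mean_distance_from_half(alpha)
    print(f"alpha={alpha}: mean |lambda - 0.5| = {distance:.3f}")
```

With alpha = 1.0 the distribution is uniform on [0, 1], so the average distance from 0.5 should come out close to 0.25.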

apply(event, state, logger)

Applies CutMix augmentation on State input.

Parameters
  • event (Event) – the current event

  • state (State) – the current trainer state

  • logger (Logger) – the training logger

match(event, state)

Runs on Event.INIT and Event.AFTER_DATALOADER.

Parameters
  • event (Event) – The current event.

  • state (State) – The current state.

Returns

bool – True if this algorithm should run now.

composer.algorithms.cutmix.cutmix.cutmix_batch(X, y, n_classes, alpha=1.0, cutmix_lambda=None, bbox=None, indices=None)

Create new samples using combinations of pairs of samples.

This is done by masking a region of X and filling the masked region with the corresponding region of a permuted copy of X. The cutmix parameter lambda should be drawn from a Beta(alpha, alpha) distribution for some parameter alpha > 0. The area of the masked region is determined by lambda, so the labels are interpolated accordingly. Note that the same lambda is used for all examples within the batch. The original paper used a fixed value of alpha = 1.
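To make the link between lambda and the masked region concrete, here is a minimal sketch of the box-sampling scheme used in the original CutMix paper's reference code, written with the standard library only. The function names sample_bbox and adjusted_lambda are hypothetical, and Composer's internal implementation may differ in details such as rounding or axis order.

```python
import math
import random


def sample_bbox(width, height, cutmix_lambda, rng):
    """Return (rx1, ry1, rx2, ry2) for a box covering roughly (1 - lambda) of the image.

    Hypothetical helper following the original paper's scheme.
    """
    cut_ratio = math.sqrt(1.0 - cutmix_lambda)  # side-length ratio of the box
    cut_w = int(width * cut_ratio)
    cut_h = int(height * cut_ratio)
    # Pick a random center, then clip the box to the image bounds.
    cx = rng.randrange(width)
    cy = rng.randrange(height)
    rx1 = max(cx - cut_w // 2, 0)
    ry1 = max(cy - cut_h // 2, 0)
    rx2 = min(cx + cut_w // 2, width)
    ry2 = min(cy + cut_h // 2, height)
    return rx1, ry1, rx2, ry2


def adjusted_lambda(bbox, width, height):
    """Recompute lambda from the clipped box so that labels match the true area."""
    rx1, ry1, rx2, ry2 = bbox
    return 1.0 - (rx2 - rx1) * (ry2 - ry1) / (width * height)


rng = random.Random(0)
lam = rng.betavariate(1.0, 1.0)  # alpha = 1, as in the original paper
bbox = sample_bbox(32, 32, lam, rng)
print(bbox, adjusted_lambda(bbox, 32, 32))
```

Because clipping can shrink the box, the recomputed lambda may differ from the sampled one. Using the recomputed value keeps the label weights consistent with the pixels that were actually replaced.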

Both the original and shuffled labels are returned. This is done because for many loss functions (such as cross entropy) the targets are given as indices, so interpolation must be handled separately.
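When targets are class indices, one common way to handle the interpolation separately is to interpolate the loss instead of the labels. The sketch below is illustrative only: the helper names and the toy probabilities are assumptions and not Composer code. It shows that, for cross entropy, weighting the two per-label losses by lambda gives the same value as a single loss against the interpolated soft target.

```python
import math


def cross_entropy(probs, target_index):
    """Cross entropy for a single example with an index target."""
    return -math.log(probs[target_index])


def soft_cross_entropy(probs, soft_target):
    """Cross entropy for a single example with a dense (soft) target."""
    return -sum(t * math.log(p) for t, p in zip(soft_target, probs))


probs = [0.7, 0.2, 0.1]        # hypothetical model output for one example
y_original, y_shuffled = 0, 2  # index labels of the two mixed samples
cutmix_lambda = 0.6            # fraction of the original sample that is kept

# Option 1: interpolate the losses computed against each index label.
mixed_loss = (cutmix_lambda * cross_entropy(probs, y_original)
              + (1 - cutmix_lambda) * cross_entropy(probs, y_shuffled))

# Option 2: build the interpolated dense target, then compute one loss.
soft_target = [0.0] * len(probs)
soft_target[y_original] += cutmix_lambda
soft_target[y_shuffled] += 1 - cutmix_lambda
soft_loss = soft_cross_entropy(probs, soft_target)

print(mixed_loss, soft_loss)  # the two values agree up to floating-point error
```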

Example

from composer.algorithms.cutmix import cutmix_batch
X_cutmix, y_cutmix = cutmix_batch(
    X=X_example,
    y=y_example,
    n_classes=1000,
    alpha=1.0
)
Parameters
  • X – input tensor of shape (B, d1, d2, …, dn), where B is the batch size and d1-dn are feature dimensions.

  • y – target tensor of shape (B, f1, f2, …, fm), where B is the batch size and f1-fm are possible target dimensions.

  • n_classes – total number of classes.

  • alpha – parameter for the beta distribution of the cutmix region size.

  • cutmix_lambda – optional, fixed size of cutmix region.

  • bbox – optional, predetermined (rx1, ry1, rx2, ry2) coords of the bounding box.

  • indices – Permutation of the batch indices 1..B. Used for permuting without randomness.

Returns
  • X_cutmix – batch of inputs after cutmix has been applied.

  • y_cutmix – labels after cutmix has been applied.
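The roles of bbox and indices can be illustrated with a toy version of the operation on nested Python lists. This is a sketch, not Composer's implementation: toy_cutmix is a hypothetical name, indices are assumed to be 0-based, x is assumed to index rows and y columns, and lambda is derived from the box area rather than passed in.

```python
def toy_cutmix(images, labels, n_classes, bbox, indices):
    """Toy CutMix on nested lists. Illustrative only, not Composer's code.

    images:  list of B images, each a list of rows (H x W)
    labels:  list of B integer class indices
    bbox:    (rx1, ry1, rx2, ry2); x is assumed to index rows, y columns
    indices: 0-based permutation of range(B) choosing each sample's partner
    """
    rx1, ry1, rx2, ry2 = bbox
    height, width = len(images[0]), len(images[0][0])
    # Fraction of each image that is kept from the original sample.
    lam = 1.0 - (rx2 - rx1) * (ry2 - ry1) / (height * width)

    mixed_images, mixed_labels = [], []
    for b, partner in enumerate(indices):
        image = [row[:] for row in images[b]]  # copy so inputs stay untouched
        for r in range(rx1, rx2):
            for c in range(ry1, ry2):
                image[r][c] = images[partner][r][c]  # paste the partner's patch
        # Interpolate one-hot targets with the same weights as the pixel areas.
        target = [0.0] * n_classes
        target[labels[b]] += lam
        target[labels[partner]] += 1.0 - lam
        mixed_images.append(image)
        mixed_labels.append(target)
    return mixed_images, mixed_labels


images = [[[0, 0], [0, 0]], [[1, 1], [1, 1]]]  # two 2x2 images
labels = [0, 1]
X_mix, y_mix = toy_cutmix(images, labels, n_classes=2,
                          bbox=(0, 0, 1, 2), indices=[1, 0])
print(X_mix)  # top row of each image now comes from its partner
print(y_mix)  # half the area was replaced, so each target is split evenly
```

Fixing both bbox and indices removes all randomness from the operation, which is convenient for unit tests.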