composer.algorithms.mixup.mixup
Core MixUp classes and functions.
Functions

- `mixup_batch` – Create new samples using convex combinations of pairs of samples.

Classes

- `MixUp` – Trains the network on convex combinations of pairs of examples and targets rather than individual examples and targets.
- class composer.algorithms.mixup.mixup.MixUp(alpha=0.2, interpolate_loss=False)
Bases: `composer.core.algorithm.Algorithm`
MixUp trains the network on convex combinations of pairs of examples and targets rather than individual examples and targets.

This is done by taking a convex combination of a given batch X with a randomly permuted copy of X. The mixing coefficient is drawn from a `Beta(alpha, alpha)` distribution. Training in this fashion sometimes reduces generalization error.
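For intuition, here is a minimal sketch of the transform described above (a hand-rolled illustration with a hypothetical `naive_mixup` helper, not Composer's implementation):

```python
import torch

def naive_mixup(X: torch.Tensor, alpha: float = 0.2) -> torch.Tensor:
    """Mix a batch with a randomly permuted copy of itself."""
    # Draw a single mixing coefficient for the whole batch from Beta(alpha, alpha).
    mixing = torch.distributions.Beta(alpha, alpha).sample()
    # Permute along the sample axis (dim 0) and take the convex combination.
    perm = torch.randperm(X.shape[0])
    return mixing * X + (1 - mixing) * X[perm]
```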
- Parameters
  - alpha (float, optional) – the pseudocount for the Beta distribution used to sample mixing parameters. As `alpha` grows, the two samples in each pair tend to be weighted more equally. As `alpha` approaches 0 from above, the combination approaches only using one element of the pair. Default: `0.2`.
  - interpolate_loss (bool, optional) – interpolates the loss rather than the labels. A useful trick when using a cross entropy loss; it will produce incorrect behavior if the loss is not a linear function of the targets (see the sketch after this list). Default: `False`.
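To make the two strategies concrete, here is a self-contained sketch with illustrative values (not Composer internals; soft-label targets in `F.cross_entropy` require PyTorch 1.10+). Because cross entropy is linear in the targets, interpolating the labels and interpolating the losses agree:

```python
import torch
import torch.nn.functional as F

num_classes, mixing = 10, 0.7         # illustrative values
logits = torch.randn(8, num_classes)  # stand-in model outputs
y = torch.randint(num_classes, (8,))  # original labels
y_perm = y[torch.randperm(8)]         # labels of the mixed-in examples

# interpolate_loss=False: interpolate the one-hot labels, then take the loss.
soft_targets = (mixing * F.one_hot(y, num_classes).float()
                + (1 - mixing) * F.one_hot(y_perm, num_classes).float())
loss_labels = F.cross_entropy(logits, soft_targets)

# interpolate_loss=True: take the loss against each label set, then interpolate.
loss_interp = (mixing * F.cross_entropy(logits, y)
               + (1 - mixing) * F.cross_entropy(logits, y_perm))

# Cross entropy is linear in the targets, so the two agree (up to float error).
assert torch.allclose(loss_labels, loss_interp, atol=1e-6)
```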
Example
```python
from composer import Trainer
from composer.algorithms import MixUp

# model, train_dataloader, eval_dataloader, and optimizer are defined as usual.
algorithm = MixUp(alpha=0.2)
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration="1ep",
    algorithms=[algorithm],
    optimizers=[optimizer],
)
```
- composer.algorithms.mixup.mixup.mixup_batch(input, target, mixing=None, alpha=0.2, indices=None)
Create new samples using convex combinations of pairs of samples.
This is done by taking a convex combination of `input` with a randomly permuted copy of `input`. The permutation takes place along the sample axis (dim 0).

The relative weight of the original `input` versus the permuted copy is defined by the `mixing` parameter. This parameter should be chosen from a `Beta(alpha, alpha)` distribution for some parameter `alpha > 0`. Note that the same `mixing` is used for the whole batch.
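Because both the coefficient and the permutation can be supplied explicitly, the transform can be pinned down for reproducibility. A hypothetical sketch (values are illustrative):

```python
import torch

from composer.functional import mixup_batch

X = torch.randn(4, 3)
y = torch.arange(4)
indices = torch.tensor([2, 3, 0, 1])  # explicit permutation along dim 0

# Supplying `mixing` and `indices` makes the output deterministic, e.g. for tests.
X_mixed, y_perm, mixing = mixup_batch(X, y, mixing=0.7, indices=indices)
# y_perm should equal y[indices] (the labels of the mixed-in examples), and
# `mixing` should echo the value passed in.
```

- Parameters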
  - input (Tensor) – input tensor of shape `(minibatch, ...)`, where `...` indicates zero or more dimensions.
  - target (Tensor) – target tensor of shape `(minibatch, ...)`, where `...` indicates zero or more dimensions.
  - mixing (float, optional) – coefficient used to interpolate between the two examples. If provided, must be in \([0, 1]\). If `None`, the value is drawn from a `Beta(alpha, alpha)` distribution. Default: `None`.
  - alpha (float, optional) – parameter for the Beta distribution over `mixing`. Ignored if `mixing` is provided. Default: `0.2`.
  - indices (Tensor, optional) – permutation of the samples to use. Default: `None`.
- Returns
  - input_mixed (torch.Tensor) – batch of inputs after mixup has been applied.
  - target_perm (torch.Tensor) – the labels of the mixed-in examples.
  - mixing (torch.Tensor) – the amount of mixing used.
Example
```python
import torch

from composer.functional import mixup_batch

N, C, H, W = 2, 3, 4, 5
num_classes = 10  # defined here so the snippet runs standalone
X = torch.randn(N, C, H, W)
y = torch.randint(num_classes, size=(N,))
X_mixed, y_perm, mixing = mixup_batch(X, y, alpha=0.2)
```
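The returned triplet carries everything needed to form the usual mixup objective. A hedged sketch (the weighting below assumes `mixing` is the weight on the mixed-in, permuted examples; the docstring leaves the orientation implicit, so swap the weights if your convention differs):

```python
import torch
import torch.nn.functional as F

from composer.functional import mixup_batch

N, num_classes = 2, 10
X = torch.randn(N, 3, 4, 5)
y = torch.randint(num_classes, size=(N,))
X_mixed, y_perm, mixing = mixup_batch(X, y, alpha=0.2)

logits = torch.randn(N, num_classes)  # stand-in for model(X_mixed)
# Interpolate the per-label losses with the returned `mixing` coefficient.
loss = ((1 - mixing) * F.cross_entropy(logits, y)
        + mixing * F.cross_entropy(logits, y_perm))
```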