composer.algorithms.cutmix.cutmix#
Core CutMix classes and functions.
Functions
cutmix_batch – Create new samples using combinations of pairs of samples.
Classes
CutMix – CutMix trains the network on non-overlapping combinations of pairs of examples and interpolated targets rather than individual examples and targets.
- class composer.algorithms.cutmix.cutmix.CutMix(num_classes, alpha=1.0)[source]#
Bases:
composer.core.algorithm.Algorithm
CutMix trains the network on non-overlapping combinations of pairs of examples and interpolated targets rather than individual examples and targets.
This is done by taking a non-overlapping combination of a given batch X with a randomly permuted copy of X. The area of the combined region is drawn from a Beta(alpha, alpha) distribution. Training in this fashion sometimes reduces generalization error.
Example
from composer.algorithms import CutMix
from composer.trainer import Trainer

cutmix_algorithm = CutMix(num_classes=1000, alpha=1.0)
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration="1ep",
    algorithms=[cutmix_algorithm],
    optimizers=[optimizer]
)
- Parameters
num_classes (int) – the number of classes in the task labels.
alpha (float) – the pseudocount for the Beta distribution used to sample area parameters. As alpha grows, the two samples in each pair tend to be weighted more equally. As alpha approaches 0 from above, the combination approaches only using one element of the pair.
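The effect of alpha can be seen by sampling the area parameter directly. A minimal sketch with NumPy (illustration only; CutMix performs this sampling internally):

```python
import numpy as np

rng = np.random.default_rng(0)

# The CutMix area parameter lambda is drawn from Beta(alpha, alpha).
# Large alpha concentrates lambda near 0.5 (pairs weighted roughly equally);
# alpha near 0 pushes lambda toward 0 or 1 (one element of the pair dominates).
for alpha in (0.2, 1.0, 10.0):
    lam = rng.beta(alpha, alpha, size=10000)
    print(f"alpha={alpha}: mean={lam.mean():.2f}, std={lam.std():.2f}")
```

For every alpha the mean stays near 0.5 (the distribution is symmetric); only the spread changes, which is why small alpha yields combinations dominated by a single example.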
- composer.algorithms.cutmix.cutmix.cutmix_batch(X, y, n_classes, alpha=1.0, cutmix_lambda=None, bbox=None, indices=None)[source]#
Create new samples using combinations of pairs of samples.
This is done by masking a region of X and filling the masked region with the corresponding region from a permuted copy of X. The cutmix parameter lambda should be chosen from a Beta(alpha, alpha) distribution for some parameter alpha > 0. The area of the masked region is determined by lambda, and so labels are interpolated accordingly. Note that the same lambda is used for all examples within the batch. The original paper used a fixed value of alpha = 1.
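The masking step can be sketched as follows. This is a simplified illustration based on the bounding-box scheme from the CutMix paper, not Composer's exact implementation; `rand_bbox` and all shapes are hypothetical:

```python
import numpy as np

def rand_bbox(H, W, cutmix_lambda, rng):
    """Pick a random box covering roughly (1 - lambda) of the image area."""
    cut_ratio = np.sqrt(1.0 - cutmix_lambda)
    cut_h, cut_w = int(H * cut_ratio), int(W * cut_ratio)
    # Center the box at a uniformly random location, clipped to the image.
    cy, cx = rng.integers(H), rng.integers(W)
    ry1, ry2 = np.clip(cy - cut_h // 2, 0, H), np.clip(cy + cut_h // 2, 0, H)
    rx1, rx2 = np.clip(cx - cut_w // 2, 0, W), np.clip(cx + cut_w // 2, 0, W)
    return rx1, ry1, rx2, ry2

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3, 32, 32))    # batch of 8 images, 3x32x32
perm = rng.permutation(len(X))         # indices of the permuted copy
rx1, ry1, rx2, ry2 = rand_bbox(32, 32, cutmix_lambda=0.7, rng=rng)

# Fill the box in each image with the same box from the permuted batch.
X_cutmix = X.copy()
X_cutmix[:, :, ry1:ry2, rx1:rx2] = X[perm][:, :, ry1:ry2, rx1:rx2]

# Recompute lambda from the exact area actually masked, since clipping
# at the image border can shrink the box.
adjusted_lambda = 1.0 - ((rx2 - rx1) * (ry2 - ry1)) / (32 * 32)
```

The adjusted lambda is what the label interpolation should use, so that the label weights match the pixel areas that actually survive clipping.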
Both the original and shuffled labels are returned. This is done because for many loss functions (such as cross entropy) the targets are given as indices, so interpolation must be handled separately.
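With index targets, the interpolation therefore happens in the loss rather than in the labels. A hedged sketch of how the two returned label sets might be combined (`cutmix_loss` and `cross_entropy` are illustrative names, not Composer APIs):

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean negative log-likelihood for integer class targets."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def cutmix_loss(logits, y, y_perm, cutmix_lambda):
    # Weight the losses on the original and shuffled targets by the
    # (area-adjusted) cutmix lambda, mirroring the interpolated labels.
    return (cutmix_lambda * cross_entropy(logits, y)
            + (1.0 - cutmix_lambda) * cross_entropy(logits, y_perm))
```

At lambda = 1 this reduces to the ordinary cross entropy on the original targets, and at lambda = 0 to the cross entropy on the shuffled targets.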
Example
from composer.algorithms.cutmix import cutmix_batch

new_input_batch = cutmix_batch(
    X=X_example,
    y=y_example,
    n_classes=1000,
    alpha=1.0
)
- Parameters
X – input tensor of shape (B, d1, d2, …, dn), where B is the batch size and d1-dn are feature dimensions.
y – target tensor of shape (B, f1, f2, …, fm), where B is the batch size and f1-fm are possible target dimensions.
n_classes – total number of classes.
alpha – parameter for the Beta distribution of the cutmix region size.
cutmix_lambda – optional, fixed size of the cutmix region.
bbox – optional, predetermined (rx1, ry1, rx2, ry2) coordinates of the bounding box.
indices – permutation of the batch indices 1..B; used for permuting without randomness.
- Returns
X_cutmix – batch of inputs after cutmix has been applied.
y_cutmix – labels after cutmix has been applied.