algorithm_hparams_registry

Module algorithm_hparams_registry
Classes
Base class for algorithms.

ALiBi (Attention with Linear Biases; Press et al., 2021) dispenses with position embeddings and instead directly biases attention matrices so that nearby tokens attend to one another more strongly.
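The distance-based bias can be sketched in a few lines. This is an illustrative helper (its name and signature are not the library's API): each (query i, key j) score is offset by `-slope * (i - j)`, and future positions are masked for causal attention.

```python
def alibi_bias(seq_len, slope):
    """Return a seq_len x seq_len additive attention bias for one head.

    Nearby key positions receive a small penalty, distant ones a large
    penalty, and future positions are masked with -inf.
    """
    return [
        [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
        for i in range(seq_len)
    ]

bias = alibi_bias(4, slope=0.5)
# bias[3][0] == -1.5: the most distant key gets the largest penalty,
# while bias[i][i] == 0.0 for the current position.
```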
The AugMix data augmentation technique.

BlurPool adds anti-aliasing filters to convolutional layers.

Changes the memory format of the model to torch.channels_last.

Drops a fraction of the rows and columns of an input image and (optionally) a target image.
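The row/column dropping can be sketched in pure Python on a list-of-lists image. The helper below is illustrative (the real transform operates on tensors and applies the same kept indices to the target image):

```python
import random

def colout(image, drop_frac, rng):
    """Drop a fraction of the rows and columns of a 2D image.

    Randomly samples which row and column indices to keep, then builds
    the smaller output image from those indices.
    """
    h, w = len(image), len(image[0])
    keep_rows = sorted(rng.sample(range(h), max(1, round(h * (1 - drop_frac)))))
    keep_cols = sorted(rng.sample(range(w), max(1, round(w * (1 - drop_frac)))))
    return [[image[i][j] for j in keep_cols] for i in keep_rows]

img = [[i * 4 + j for j in range(4)] for i in range(4)]
small = colout(img, drop_frac=0.5, rng=random.Random(0))
# Dropping half the rows and columns of a 4x4 image yields a 2x2 image.
```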
CutMix trains the network on non-overlapping combinations of pairs of examples and interpolated targets rather than on individual examples and targets.

CutOut is a data augmentation technique that masks out one or more square regions of an input image.
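The masking step is simple to sketch. The helper below is illustrative (names and signature are assumptions, and the region location is normally sampled at random):

```python
def cutout(image, top, left, size, fill=0.0):
    """Mask a size x size square of a 2D image with a fill value."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input is untouched
    for i in range(top, min(top + size, h)):
        for j in range(left, min(left + size, w)):
            out[i][j] = fill
    return out

masked = cutout([[1.0] * 4 for _ in range(4)], top=1, left=1, size=2)
# Four pixels are zeroed out; the remaining twelve keep their value.
```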
Maintains a shadow model whose weights track the exponential moving average of the trained model's weights.
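One EMA step can be written over flat parameter lists. This sketch is illustrative; the actual technique maintains a full shadow copy of the model's parameters:

```python
def ema_update(shadow, weights, decay=0.99):
    """One exponential-moving-average step:
    shadow <- decay * shadow + (1 - decay) * weights.
    """
    return [decay * s + (1.0 - decay) * w for s, w in zip(shadow, weights)]

shadow = [0.0]
shadow = ema_update(shadow, [1.0], decay=0.9)
# The shadow weight moves 10% of the way toward the trained weight.
```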
Decomposes linear operators into pairs of smaller linear operators.

Replaces all instances of torch.nn.LayerNorm with apex.normalization.fused_layer_norm.FusedLayerNorm.

Replaces all Linear layers in the feed-forward subnetwork with Gated Linear Units.

Replaces batch normalization modules with Ghost Batch Normalization modules, which simulate the effect of using a smaller batch size.

Clips all gradients in the model according to the specified clipping_type.
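As a sketch of one common clipping type, global-norm clipping rescales the whole gradient vector so its L2 norm does not exceed a threshold (value-based variants clip each element instead). The helper below is illustrative, working on a flat list of gradients:

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Scale a flat gradient list so its L2 norm is at most max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm or total == 0.0:
        return list(grads)  # already within the bound
    scale = max_norm / total
    return [g * scale for g in grads]

clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
# The gradient has norm 5.0, so every element is scaled by 0.2.
```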
Shrinks targets towards a uniform distribution, as in Szegedy et al.
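The smoothing rule itself is one line: mix the one-hot target with the uniform distribution, y' = (1 - alpha) * y + alpha / num_classes. A minimal sketch (helper name is illustrative):

```python
def smooth_labels(one_hot, alpha):
    """Shrink a one-hot target toward the uniform distribution."""
    n = len(one_hot)
    return [(1.0 - alpha) * y + alpha / n for y in one_hot]

smoothed = smooth_labels([1.0, 0.0, 0.0, 0.0], alpha=0.1)
# The true class keeps most of the probability mass; the remainder is
# spread uniformly over all classes, and the result still sums to 1.
```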
Progressively freezes the layers of the network during training, starting with the earlier layers.

MixUp trains the network on convex combinations of pairs of examples and their targets.
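The convex combination can be sketched on flat examples. In the full technique, the mixing coefficient is drawn from a Beta distribution and the two targets are interpolated with the same coefficient; the helper below is illustrative:

```python
def mixup(x1, x2, lam):
    """Convex combination of two flat examples: lam * x1 + (1 - lam) * x2."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]

mixed = mixup([1.0, 0.0], [0.0, 1.0], lam=0.7)
# Roughly [0.7, 0.3]: 70% of the first example, 30% of the second.
```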
Runs on

Resizes inputs, and optionally targets, by cropping or interpolating.
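The interpolation path can be sketched with a nearest-neighbor resize on a list-of-lists image. This is an assumption-laden illustration (the real transform works on tensors and also supports cropping):

```python
def resize_nearest(image, out_h, out_w):
    """Nearest-neighbor resize of a 2D image to out_h x out_w."""
    h, w = len(image), len(image[0])
    return [
        [image[i * h // out_h][j * w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]

img = [[i * 4 + j for j in range(4)] for i in range(4)]
half = resize_nearest(img, 2, 2)
# Halving a 4x4 image keeps every other pixel in each dimension.
```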
Randomly applies a sequence of image data augmentations to an image.

Adds sharpness-aware minimization (Foret et al., 2020) by wrapping an existing optimizer.

Applies Stochastic Weight Averaging (Izmailov et al., 2018).
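The averaging step at SWA's core is a running mean of weight snapshots: after n snapshots have been folded in, the next contributes with weight 1 / (n + 1). A minimal sketch over flat weight lists (the helper name is illustrative):

```python
def swa_update(running_avg, weights, n_averaged):
    """Fold one more weight snapshot into a running average."""
    return [a + (w - a) / (n_averaged + 1) for a, w in zip(running_avg, weights)]

avg = [1.0]                               # average of one snapshot
avg = swa_update(avg, [3.0], n_averaged=1)  # now the average of 1.0 and 3.0
```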
Selectively backpropagates gradients from a subset of each batch.

Progressively increases the sequence length during training.
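A warmup schedule for the sequence length can be sketched as a linear ramp; the linear shape and the helper's name are assumptions for illustration:

```python
def warmup_seq_len(step, warmup_steps, min_len, max_len):
    """Linearly grow the sequence length from min_len to max_len
    over warmup_steps, then hold it at max_len.
    """
    frac = min(step / warmup_steps, 1.0)
    return int(min_len + frac * (max_len - min_len))

lengths = [warmup_seq_len(s, 100, 8, 128) for s in (0, 50, 100, 200)]
# Starts at 8 tokens, reaches 128 at the end of warmup, then stays there.
```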
Adds Squeeze-and-Excitation blocks (Hu et al., 2019) after convolutional layers.

Applies Stochastic Depth (Huang et al., 2016) to the specified model.
Attributes
Dict
Type
Union
algorithm_registry