composer.algorithms.factorize.factorize
Functions

apply_factorization | Replaces Linear and Conv2d modules with FactorizedLinear and FactorizedConv2d modules.
cast | Cast a value to a type.
factorizing_could_speedup | Whether factorizing a module a given amount could possibly yield a benefit.

Classes

Algorithm | Base class for algorithms.
Event | Enum to represent events in the training loop.
Factorize | Decomposes linear operators into pairs of smaller linear operators.
FactorizedConv2d | Factorized replacement for torch.nn.Conv2d.
FactorizedLinear | Factorized replacement for torch.nn.Linear.
Logger | An interface to record training data.
Optimizer | Base class for all optimizers.
State | The state of the trainer.
Attributes
LOG_NUM_CONV2D_REPLACEMENTS_KEY
LOG_NUM_LINEAR_REPLACEMENTS_KEY
Optional
Sequence
Type
Union
annotations
log
- class composer.algorithms.factorize.factorize.Factorize(factorize_convs=True, factorize_linears=True, min_channels=256, latent_channels=0.25, min_features=256, latent_features=128)
Bases: composer.core.algorithm.Algorithm
Decomposes linear operators into pairs of smaller linear operators.
Specifically, this algorithm replaces Conv2d and Linear modules with FactorizedConv2d and FactorizedLinear modules.
The replacement is only performed if doing so would reduce the number of multiply-adds used to compute each module's output. For linear layers and pointwise convolutions, this means that the factorization must use an intermediate rank of less than half the input and output ranks, since it must perform two operations instead of one.
For convolutions with kernel sizes greater than 1, the threshold for factorization being worthwhile varies with kernel size. Larger kernels allow larger intermediate ranks.
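To make the break-even arithmetic concrete, the sketch below compares the multiply-adds of a full convolution against a two-stage factorization, assuming the common scheme of a k x k convolution into the latent channels followed by a pointwise convolution. The helper factorization_saves_macs is hypothetical, written only to illustrate the thresholds described above:

    def factorization_saves_macs(in_ch, out_ch, kernel_size, latent_ch):
        # Hypothetical helper: multiply-adds per output spatial position.
        full = in_ch * out_ch * kernel_size ** 2
        factorized = (in_ch * latent_ch * kernel_size ** 2  # k x k conv into latent channels
                      + latent_ch * out_ch)                 # 1 x 1 conv back to out_ch
        return factorized < full

    # Pointwise conv (k=1): the latent rank must be under half of 256.
    assert factorization_saves_macs(256, 256, kernel_size=1, latent_ch=127)
    assert not factorization_saves_macs(256, 256, kernel_size=1, latent_ch=128)
    # A 3x3 kernel tolerates a much larger latent rank.
    assert factorization_saves_macs(256, 256, kernel_size=3, latent_ch=200)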
See factorize_matrix() and factorize_conv2d() for more information about the factorization process. See FactorizedConv2d and FactorizedLinear for more information about the factorized modules used to replace the original modules.
Runs on INIT.
- Parameters
factorize_convs (bool) – whether to try factorizing Conv2d modules. Default: True.
factorize_linears (bool) – whether to try factorizing Linear modules. Default: True.
min_channels (int) – if a Conv2d module does not have at least this many input and output channels, it will be ignored. Modules with few channels are unlikely to be accelerated by factorization due to poor hardware utilization. Default: 256.
latent_channels (int, float) – number of latent channels to use in factorized convolutions. Can be specified as either an integer > 1 or as a float within [0, 1). In the latter case, the value is interpreted as a fraction of min(in_channels, out_channels) for each Conv2d module, and is converted to the equivalent integer value, with a minimum of 1. Default: 0.25.
min_features (int) – if a Linear module does not have at least this many input and output features, it will be ignored. Modules with few features are unlikely to be accelerated by factorization due to poor hardware utilization. Default: 256.
latent_features (int, float) – size of the latent space for factorized linear modules. Can be specified as either an integer > 1 or as a float within [0, 0.5). In the latter case, the value is interpreted as a fraction of min(in_features, out_features) for each Linear module, and is converted to the equivalent integer value, with a minimum of 1. Default: 128.
- composer.algorithms.factorize.factorize.apply_factorization(model, factorize_convs=True, factorize_linears=True, min_channels=512, latent_channels=0.25, min_features=512, latent_features=0.25, optimizers=None)
Replaces Linear and Conv2d modules with FactorizedLinear and FactorizedConv2d modules.
Factorized modules replace one full-rank operation with a sequence of two lower-rank operations. When the rank is low enough, this can save computation, at the cost of expressive power. See Factorize for details.
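The idea behind the lower-rank sequence can be sketched with plain torch modules; this is an illustration of the concept, not Composer's FactorizedLinear itself:

    import torch
    import torch.nn as nn

    full = nn.Linear(512, 512)               # 512 * 512 = 262,144 mult-adds per row
    factored = nn.Sequential(
        nn.Linear(512, 64, bias=False),      # 512 * 64 = 32,768
        nn.Linear(64, 512),                  # 64 * 512 = 32,768
    )
    x = torch.randn(8, 512)
    assert full(x).shape == factored(x).shape  # same interface, ~4x fewer mult-adds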
for details.- Parameters
model (Module) – the model to modify in-place.
factorize_convs (bool, optional) – whether to try factorizing Conv2d modules. Default: True.
factorize_linears (bool, optional) – whether to try factorizing Linear modules. Default: True.
min_channels (int, optional) – if a Conv2d module does not have at least this many input and output channels, it will be ignored. Modules with few channels are unlikely to be accelerated by factorization due to poor hardware utilization. Default: 512.
latent_channels (int or float, optional) – number of latent channels to use in factorized convolutions. Can be specified as either an integer > 1 or as a float within [0, 1). In the latter case, the value is interpreted as a fraction of min(in_channels, out_channels) for each Conv2d module, and is converted to the equivalent integer value, with a minimum of 1. Default: 0.25.
min_features (int, optional) – if a Linear module does not have at least this many input and output features, it will be ignored. Modules with few features are unlikely to be accelerated by factorization due to poor hardware utilization. Default: 512.
latent_features (int or float, optional) – size of the latent space for factorized linear modules. Can be specified as either an integer > 1 or as a float within [0, 0.5). In the latter case, the value is interpreted as a fraction of min(in_features, out_features) for each Linear module, and is converted to the equivalent integer value, with a minimum of 1. Default: 0.25.
optimizers (Optimizer | Sequence[Optimizer], optional) – existing optimizers bound to model.parameters(). All optimizers that have already been constructed with model.parameters() must be specified here so that they will optimize the correct parameters. If the optimizer(s) are constructed after calling this function, then it is safe to omit this parameter; those optimizers will see the correct model parameters. The safe construction order is shown in the sketch after the example below.
- Returns
The modified model.
Example
    import composer.functional as cf
    from torchvision import models

    model = models.resnet50()
    cf.apply_factorization(model)
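As referenced in the optimizers parameter above, the two construction orders can be sketched as follows; the SGD settings are arbitrary placeholders:

    import torch
    import composer.functional as cf
    from torchvision import models

    # Safe: factorize first, then build the optimizer, so it is
    # constructed over the factorized modules' parameters.
    model = models.resnet50()
    cf.apply_factorization(model)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    # If an optimizer already exists, pass it in so its parameter
    # groups are updated to track the replacement modules.
    model2 = models.resnet50()
    opt2 = torch.optim.SGD(model2.parameters(), lr=0.1)
    cf.apply_factorization(model2, optimizers=opt2)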