composer.algorithms.factorize.factorize#

Functions

apply_factorization

Replaces Linear and Conv2d modules with FactorizedLinear and FactorizedConv2d modules.

cast

Cast a value to a type.

factorizing_could_speedup

Whether factorizing a module a given amount could possibly yield a benefit.

Classes

Algorithm

Base class for algorithms.

Event

Enum to represent events in the training loop.

Factorize

Decomposes linear operators into pairs of smaller linear operators.

FactorizedConv2d

Factorized replacement for torch.nn.Conv2d.

FactorizedLinear

Factorized replacement for torch.nn.Linear.

Logger

An interface to record training data.

Optimizer

Base class for all optimizers.

State

The state of the trainer.

Attributes

  • LOG_NUM_CONV2D_REPLACEMENTS_KEY

  • LOG_NUM_LINEAR_REPLACEMENTS_KEY

  • Optional

  • Sequence

  • Type

  • Union

  • annotations

  • log

class composer.algorithms.factorize.factorize.Factorize(factorize_convs=True, factorize_linears=True, min_channels=256, latent_channels=0.25, min_features=256, latent_features=128)[source]#

Bases: composer.core.algorithm.Algorithm

Decomposes linear operators into pairs of smaller linear operators.

Specifically, this algorithm replaces Conv2d and Linear modules with FactorizedConv2d and FactorizedLinear modules.

The replacement is only performed if doing so would reduce the number of multiply-adds used to compute each module's output. For linear layers and pointwise convolutions, this means that the factorization must use an intermediate rank of less than half the input and output ranks, since it must perform two operations instead of one.
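
For intuition, the break-even point in the linear (or pointwise) case can be checked with a quick multiply-add count. The shapes below are purely illustrative:

in_features, out_features = 1024, 1024

# Multiply-adds for one output row: a single dense matmul vs. two smaller ones.
full_macs = in_features * out_features
def factorized_macs(rank):
    return in_features * rank + rank * out_features

# Factorization saves work only when rank < in*out / (in + out),
# which is below half of min(in_features, out_features) when the sizes match.
print(in_features * out_features / (in_features + out_features))  # 512.0
print(factorized_macs(256) < full_macs)  # True: rank 256 reduces multiply-adds
print(factorized_macs(600) < full_macs)  # False: rank 600 does not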

For convolutions with kernel sizes greater than 1, the threshold for factorization being worthwhile varies with kernel size. Larger kernels allow larger intermediate ranks.
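
The same count for a k x k convolution, assuming the factorized form is a k x k convolution down to a latent width followed by a 1 x 1 convolution back up (a sketch of one common scheme; see factorize_conv2d() for the exact decomposition), shows the break-even latent width growing with kernel size:

def conv_macs(in_ch, out_ch, k):
    # Multiply-adds per output position for a dense k x k convolution.
    return k * k * in_ch * out_ch

def factorized_conv_macs(in_ch, out_ch, k, latent):
    # k x k conv to `latent` channels, then a 1 x 1 conv up to `out_ch`.
    return k * k * in_ch * latent + latent * out_ch

in_ch = out_ch = 256
for k in (1, 3):
    # Largest latent width that still reduces multiply-adds.
    break_even = conv_macs(in_ch, out_ch, k) / (k * k * in_ch + out_ch)
    print(k, break_even)  # k=1 -> 128.0, k=3 -> 230.4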

See factorize_matrix() and factorize_conv2d() for more information about the factorization process. See FactorizedConv2d and FactorizedLinear for more information about the factorized modules used to replace the original modules.

Runs on INIT.
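
A minimal sketch of enabling the algorithm through the Trainer; my_model and my_train_dataloader are placeholders for a ComposerModel and dataloader you have already constructed:

from composer import Trainer
from composer.algorithms import Factorize

trainer = Trainer(
    model=my_model,                        # placeholder ComposerModel
    train_dataloader=my_train_dataloader,  # placeholder dataloader
    max_duration="1ep",
    algorithms=[
        Factorize(
            factorize_convs=True,
            latent_channels=0.25,   # fraction of min(in_channels, out_channels)
            factorize_linears=True,
            latent_features=128,    # absolute latent width for Linear modules
        )
    ],
)
trainer.fit()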

Parameters
  • factorize_convs (bool) – whether to try factorizing Conv2d modules. Default: True.

  • factorize_linears (bool) – whether to try factorizing Linear modules. Default: True.

  • min_channels (int) – if a Conv2d module does not have at least this many input and output channels, it will be ignored. Modules with few channels are unlikely to be accelerated by factorization due to poor hardware utilization. Default: 256.

  • latent_channels (int, float) – number of latent channels to use in factorized convolutions. Can be specified as either an integer > 1 or as a float within [0, 1). In the latter case, the value is interpreted as a fraction of min(in_channels, out_channels) for each Conv2d module, and is converted to the equivalent integer value, with a minimum of 1. Default: 0.25.

  • min_features (int) – if a Linear module does not have at least this many input and output features, it will be ignored. Modules with few features are unlikely to be accelerated by factorization due to poor hardware utilization. Default: 256.

  • latent_features (int, float) – size of the latent space for factorized linear modules. Can be specified as either an integer > 1 or as a float within [0, 0.5). In the latter case, the value is interpreted as a fraction of min(in_features, out_features) for each Linear module, and is converted to the equivalent integer value, with a minimum of 1 (see the sketch after this list). Default: 128.
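
When latent_channels or latent_features is given as a fraction, the effective integer width is resolved per module roughly as in the sketch below (an illustration of the documented rule; the exact rounding used by the library may differ):

def effective_latent_size(latent, in_size, out_size):
    # Fractions are interpreted relative to min(in_size, out_size), with a floor of 1.
    if 0 <= latent < 1:
        return max(1, int(latent * min(in_size, out_size)))
    return int(latent)  # already an absolute size

print(effective_latent_size(0.25, 512, 2048))  # 128
print(effective_latent_size(128, 512, 2048))   # 128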

apply(event, state, logger)[source]#

Factorize convolutional and linear layers.

Parameters
  • event (Event) – the current event

  • state (State) – the current trainer state

  • logger (Logger) – the training logger

match(event, state)[source]#

Runs on INIT.

Parameters
  • event (Event) – The current event.

  • state (State) – The current state.

Returns

bool – True if this algorithm should run

composer.algorithms.factorize.factorize.apply_factorization(model, factorize_convs=True, factorize_linears=True, min_channels=512, latent_channels=0.25, min_features=512, latent_features=0.25, optimizers=None)[source]#

Replaces Linear and Conv2d modules with FactorizedLinear and FactorizedConv2d modules.

Factorized modules replace one full-rank operation with a sequence of two lower-rank operations. When the rank is low enough, this can save computation, at the cost of expressive power. See Factorize for details.

Parameters
  • model (Module) – the model to modify in-place

  • factorize_convs (bool, optional) – whether to try factorizing Conv2d modules. Default: True.

  • factorize_linears (bool, optional) – whether to try factorizing Linear modules. Default: True.

  • min_channels (int, optional) – if a Conv2d module does not have at least this many input and output channels, it will be ignored. Modules with few channels are unlikely to be accelerated by factorization due to poor hardware utilization. Default: 512.

  • latent_channels (int or float, optional) – number of latent channels to use in factorized convolutions. Can be specified as either an integer > 1 or as a float within [0, 1). In the latter case, the value is interpreted as a fraction of min(in_channels, out_channels) for each Conv2d module, and is converted to the equivalent integer value, with a minimum of 1. Default: 0.25.

  • min_features (int, optional) – if a Linear module does not have at least this many input and output features, it will be ignored. Modules with few features are unlikely to be accelerated by factorization due to poor hardware utilization. Default: 512.

  • latent_features (int or float, optional) – size of the latent space for factorized linear modules. Can be specified as either an integer > 1 or as a float within [0, 0.5). In the latter case, the value is interpreted as a fraction of min(in_features, out_features) for each Linear module, and is converted to the equivalent integer value, with a minimum of 1. Default: 0.25.

  • optimizers (Optimizer | Sequence[Optimizer], optional) –

    Existing optimizers bound to model.parameters(). All optimizers that have already been constructed with model.parameters() must be specified here so they will optimize the correct parameters.

    If the optimizer(s) are constructed after calling this function, then it is safe to omit this parameter. These optimizers will see the correct model parameters.

Returns

The modified model

Example

import composer.functional as cf
from torchvision import models

# Build a standard ResNet-50, then replace eligible Conv2d and Linear modules
# with their factorized counterparts in-place.
model = models.resnet50()
cf.apply_factorization(model)
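
If an optimizer was already constructed from model.parameters() before factorization, pass it via optimizers so it tracks the replacement parameters (a sketch; the SGD settings are placeholders):

import torch
import composer.functional as cf
from torchvision import models

model = models.resnet50()
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # built before factorization

# `opt` must be passed in so it ends up optimizing the new factorized
# parameters rather than the replaced ones.
cf.apply_factorization(model, optimizers=opt)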