composer.optim

Modules

composer.optim.decoupled_weight_decay - Optimizers with weight decay decoupled from the learning rate.
composer.optim.optimizer_hparams - Hyperparameters for optimizers.
composer.optim.scheduler - Stateless learning rate schedulers.
composer.optim.scheduler_hparams - Hyperparameters for schedulers.

Optimizers and learning rate schedulers.

Composer is compatible with optimizers based on PyTorch's native Optimizer API, and common optimizers such as SGD and Adam have been thoroughly tested with Composer. However, where applicable, we recommend the optimizers provided in decoupled_weight_decay, since they improve upon their PyTorch equivalents by decoupling the weight decay term from the learning rate.
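As a minimal sketch (the hyperparameter values are illustrative assumptions, not recommended defaults), a decoupled optimizer is constructed the same way as its torch.optim counterpart and can then be passed to the Trainer or used in a plain PyTorch training loop:

```python
import torch

from composer.optim import DecoupledAdamW

# Stand-in model; any torch.nn.Module works here.
model = torch.nn.Linear(128, 10)

# DecoupledAdamW is constructed like torch.optim.AdamW, but its weight decay
# update is applied independently of the learning rate. The values below are
# illustrative only.
optimizer = DecoupledAdamW(
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),
    weight_decay=1e-5,
)
```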

PyTorch schedulers can be used with Composer, but this is discouraged. Instead, we recommend schedulers built on Composer's ComposerScheduler API, which offers more flexibility and configurability when writing schedulers.
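As a sketch of that API: a ComposerScheduler is, in essence, a stateless callable that maps the current training State to a multiplier on the optimizer's base learning rate. The example below assumes training has a max_duration set (so state.get_elapsed_duration() is available), and the 10% warmup fraction is an arbitrary choice for illustration; the ssr (scale schedule ratio) argument is unused here.

```python
from composer.core import State


def triangular_schedule(state: State, ssr: float = 1.0) -> float:
    """Stateless scheduler sketch: linear warmup over the first 10% of
    training, then linear decay back to zero."""
    # Fraction of total training completed, in [0, 1].
    frac = state.get_elapsed_duration().value
    warmup = 0.1
    if frac < warmup:
        return frac / warmup
    return max(0.0, (1.0 - frac) / (1.0 - warmup))
```

Built-in schedulers such as LinearWithWarmupScheduler and CosineAnnealingWithWarmupScheduler implement this same interface and can be passed to the Trainer in the same way.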

Classes

ComposerScheduler - Specification for a stateless scheduler function.
ConstantScheduler - Maintains a fixed learning rate.
CosineAnnealingScheduler - Decays the learning rate according to the decreasing part of a cosine curve.
CosineAnnealingWarmRestartsScheduler - Cyclically decays the learning rate according to the decreasing part of a cosine curve.
CosineAnnealingWithWarmupScheduler - Decays the learning rate according to the decreasing part of a cosine curve, with an initial warmup.
DecoupledAdamW - Adam optimizer with the weight decay term decoupled from the learning rate.
DecoupledSGDW - SGD optimizer with the weight decay term decoupled from the learning rate.
ExponentialScheduler - Decays the learning rate exponentially.
LinearScheduler - Adjusts the learning rate linearly.
LinearWithWarmupScheduler - Adjusts the learning rate linearly, with an initial warmup.
MultiStepScheduler - Decays the learning rate discretely at fixed milestones.
MultiStepWithWarmupScheduler - Decays the learning rate discretely at fixed milestones, with an initial warmup.
PolynomialScheduler - Sets the learning rate to be proportional to a power of the fraction of training time left.
StepScheduler - Decays the learning rate discretely at fixed intervals.

Hparams

These classes are used with yahp for YAML-based configuration; a usage sketch follows the list below.

AdamHparams - Hyperparameters for the Adam optimizer.
AdamWHparams - Hyperparameters for the AdamW optimizer.
ConstantSchedulerHparams - Hyperparameters for the ConstantScheduler scheduler.
CosineAnnealingSchedulerHparams - Hyperparameters for the CosineAnnealingScheduler scheduler.
CosineAnnealingWarmRestartsSchedulerHparams - Hyperparameters for the CosineAnnealingWarmRestartsScheduler scheduler.
CosineAnnealingWithWarmupSchedulerHparams - Hyperparameters for the CosineAnnealingWithWarmupScheduler scheduler.
DecoupledAdamWHparams - Hyperparameters for the DecoupledAdamW optimizer.
DecoupledSGDWHparams - Hyperparameters for the DecoupledSGDW optimizer.
ExponentialSchedulerHparams - Hyperparameters for the ExponentialScheduler scheduler.
LinearSchedulerHparams - Hyperparameters for the LinearScheduler scheduler.
LinearWithWarmupSchedulerHparams - Hyperparameters for the LinearWithWarmupScheduler scheduler.
MultiStepSchedulerHparams - Hyperparameters for the MultiStepScheduler scheduler.
MultiStepWithWarmupSchedulerHparams - Hyperparameters for the MultiStepWithWarmupScheduler scheduler.
OptimizerHparams - Base class for optimizer hyperparameter classes.
PolynomialSchedulerHparams - Hyperparameters for the PolynomialScheduler scheduler.
RAdamHparams - Hyperparameters for the RAdam optimizer.
RMSpropHparams - Hyperparameters for the RMSprop optimizer.
SGDHparams - Hyperparameters for the SGD optimizer.
SchedulerHparams - Base class for scheduler hyperparameter classes.
StepSchedulerHparams - Hyperparameters for the StepScheduler scheduler.
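As a sketch of the intended workflow: an hparams dataclass is populated (by yahp from YAML, or directly in Python) and then materialized into the corresponding optimizer. The YAML keys and the initialize_object call below follow the usual yahp pattern and are assumptions, not guaranteed API; consult the class documentation for the exact interface.

```python
import torch

from composer.optim.optimizer_hparams import DecoupledAdamWHparams

model = torch.nn.Linear(128, 10)

# Construct the hparams dataclass directly; in practice yahp would populate
# it from a YAML section along the lines of (keys are an assumption):
#
#   optimizer:
#     decoupled_adamw:
#       lr: 1.0e-3
#       weight_decay: 1.0e-5
#
hparams = DecoupledAdamWHparams(lr=1e-3, weight_decay=1e-5)

# Materialize the actual optimizer from the hparams object. The exact
# initialize_object signature is an assumption based on the usual yahp
# pattern.
optimizer = hparams.initialize_object(model.parameters())
```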