composer.optim.scheduler#
Functions
constant_scheduler() | Maintains a fixed learning rate.
cosine_annealing_scheduler() | Decays the learning rate according to the decreasing part of a cosine curve.
cosine_annealing_warm_restarts_scheduler() | Cyclically decays the learning rate according to the decreasing part of a cosine curve.
cosine_annealing_with_warmup_scheduler() | Decays the learning rate according to the decreasing part of a cosine curve, with a linear warmup.
exponential_scheduler() | Decays the learning rate exponentially.
linear_scheduler() | Adjusts the learning rate linearly.
linear_with_warmup_scheduler() | Adjusts the learning rate linearly, with a linear warmup.
multi_step_scheduler() | Decays the learning rate discretely at fixed milestones.
multi_step_with_warmup_scheduler() | Decays the learning rate discretely at fixed milestones, with a linear warmup.
polynomial_scheduler() | Decays the learning rate proportionally to a power of the fraction of training time remaining.
step_scheduler() | Decays the learning rate discretely at fixed intervals.
Classes
ComposerSchedulerFn() | Specification for a "stateless" scheduler function.
Hparams
These classes are used with yahp for YAML-based configuration.
ConstantLRHparams | Hyperparameters for the constant_scheduler() scheduler.
CosineAnnealingLRHparams | Hyperparameters for the cosine_annealing_scheduler() scheduler.
CosineAnnealingWarmRestartsHparams | Hyperparameters for the cosine_annealing_warm_restarts_scheduler() scheduler.
CosineAnnealingWithWarmupLRHparams | Hyperparameters for the cosine_annealing_with_warmup_scheduler() scheduler.
ExponentialLRHparams | Hyperparameters for the exponential_scheduler() scheduler.
LinearLRHparams | Hyperparameters for the linear_scheduler() scheduler.
LinearWithWarmupLRHparams | Hyperparameters for the linear_with_warmup_scheduler() scheduler.
MultiStepLRHparams | Hyperparameters for the multi_step_scheduler() scheduler.
MultiStepWithWarmupLRHparams | Hyperparameters for the multi_step_with_warmup_scheduler() scheduler.
PolynomialLRHparams | Hyperparameters for the polynomial_scheduler() scheduler.
SchedulerHparams | Abstract base class for scheduler hyperparameters.
StepLRHparams | Hyperparameters for the step_scheduler() scheduler.
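As a rough illustration, the sketch below constructs one of these hyperparameter classes directly in Python. Field names are taken from the constructor signatures documented on this page; the YAML keys under which yahp registers each class are not shown here.

    # A minimal sketch: building scheduler hyperparameters directly in Python.
    # yahp would normally populate these same dataclass fields from a YAML file.
    from composer.optim.scheduler import CosineAnnealingLRHparams

    scheduler_hparams = CosineAnnealingLRHparams(t_max='1dur', min_factor=0.01)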
Attributes
ComposerScheduler
- class composer.optim.scheduler.ComposerSchedulerFn(*args, **kwargs)[source]#
Bases:
Protocol
Specification for a "stateless" scheduler function.
A scheduler function should be a pure function that returns a multiplier to apply to the optimizer's provided learning rate, given the current trainer state, and optionally a "scale schedule ratio" (SSR). A typical implementation will read state.timer, and possibly other fields like state.max_duration, to determine the trainer's latest temporal progress.
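For illustration, here is a minimal sketch of a custom scheduler conforming to this protocol. The function name and the midpoint threshold are purely illustrative, and the sketch assumes max_duration is expressed in epochs:

    from composer.core import State

    def halve_after_midpoint(state: State, *, ssr: float = 1.0) -> float:
        # Fraction of training elapsed, assuming max_duration is in epochs
        # so that it shares a unit with state.timer.epoch.
        elapsed = state.timer.epoch.value / (state.max_duration.value * ssr)
        # Full learning rate for the first half of training, half thereafter.
        return 1.0 if elapsed < 0.5 else 0.5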
- class composer.optim.scheduler.ConstantLRHparams(factor=1.0, total_time='1dur')[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the constant_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, factor=1.0, total_time='1dur')#
Maintains a fixed learning rate.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
factor (float) – Factor. Default = 1.0.
- class composer.optim.scheduler.CosineAnnealingLRHparams(t_max='1dur', min_factor=0.0)[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the cosine_annealing_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, t_max='1dur', min_factor=0.0)#
Decays the learning rate according to the decreasing part of a cosine curve.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
min_factor (float) – Minimum factor. Default = 0.0.
- class composer.optim.scheduler.CosineAnnealingWarmRestartsHparams(t_0='1dur', min_factor=0.0, t_mult=1.0)[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the cosine_annealing_warm_restarts_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, t_0, t_mult=1.0, min_factor=0.0)#
Cyclically decays the learning rate according to the decreasing part of a cosine curve.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
t_mult (float) – The multiplier for subsequent cycles' durations. Default = 1.0.
min_factor (float) – Minimum factor. Default = 0.0.
- class composer.optim.scheduler.CosineAnnealingWithWarmupLRHparams(warmup_time, t_max='1dur', min_factor=0.0)[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the cosine_annealing_with_warmup_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, warmup_time, t_max='1dur', min_factor=0.0)#
Decays the learning rate according to the decreasing part of a cosine curve, with a linear warmup.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
min_factor (float) – Minimum factor. Default = 0.0.
- class composer.optim.scheduler.ExponentialLRHparams(gamma)[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the exponential_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, gamma)#
Decays the learning rate exponentially.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
gamma (float) – Gamma.
- class composer.optim.scheduler.LinearLRHparams(start_factor=1.0, end_factor=0.0, total_time='1dur')[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the linear_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, start_factor=1.0, end_factor=0.0, total_time='1dur')#
Adjusts the learning rate linearly.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
start_factor (float) – Start factor. Default = 1.0.
end_factor (float) – End factor. Default = 0.0.
- class composer.optim.scheduler.LinearWithWarmupLRHparams(warmup_time, start_factor=1.0, end_factor=0.0, total_time='1dur')[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the linear_with_warmup_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, warmup_time, start_factor=1.0, end_factor=0.0, total_time='1dur')#
Adjusts the learning rate linearly, with a linear warmup.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
start_factor (float) – Start factor. Default = 1.0.
end_factor (float) – End factor. Default = 0.0.
- class composer.optim.scheduler.MultiStepLRHparams(milestones, gamma=0.1)[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the multi_step_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, milestones, gamma=0.1)#
Decays the learning rate discretely at fixed milestones.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
milestones (list of str or Time) – Milestones.
gamma (float) – Gamma. Default = 0.1.
- class composer.optim.scheduler.MultiStepWithWarmupLRHparams(warmup_time, milestones, gamma=0.1)[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the multi_step_with_warmup_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, warmup_time, milestones, gamma=0.1)#
Decays the learning rate discretely at fixed milestones, with a linear warmup.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
milestones (list of str or Time) – Milestones.
gamma (float) – Gamma. Default = 0.1.
- class composer.optim.scheduler.PolynomialLRHparams(power, t_max='1dur', min_factor=0.0)[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the polynomial_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, t_max='1dur', power, min_factor=0.0)#
Decays the learning rate proportionally to a power of the fraction of training time remaining.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
power (float) – Power.
min_factor (float) – Minimum factor. Default = 0.0.
- class composer.optim.scheduler.SchedulerHparams[source]#
Bases:
yahp.hparams.Hparams, abc.ABC
Abstract base class for scheduler hyperparameters.
- class composer.optim.scheduler.StepLRHparams(step_size, gamma=0.1)[source]#
Bases:
composer.optim.scheduler.SchedulerHparams
Hyperparameters for the step_scheduler() scheduler.
- scheduler_function(*, ssr=1.0, step_size, gamma=0.1)#
Decays the learning rate discretely at fixed intervals.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
gamma (float) – Gamma. Default = 0.1.
- composer.optim.scheduler.constant_scheduler(state, *, ssr=1.0, factor=1.0, total_time='1dur')[source]#
Maintains a fixed learning rate.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
factor (float) – Factor. Default = 1.0.
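Since the hyperparameters are keyword-only, one way to bind them ahead of time is functools.partial, which leaves a function of the trainer state alone; a hedged usage sketch:

    import functools

    from composer.optim.scheduler import constant_scheduler

    # Bind the hyperparameters up front; the result matches the
    # ComposerSchedulerFn shape (a function of `state` plus optional `ssr`).
    scheduler = functools.partial(constant_scheduler, factor=0.5, total_time='1dur')
    # Later, e.g. once per step: lr_multiplier = scheduler(state)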
- composer.optim.scheduler.cosine_annealing_scheduler(state, *, ssr=1.0, t_max='1dur', min_factor=0.0)[source]#
Decays the learning rate according to the decreasing part of a cosine curve.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
min_factor (float) – Minimum factor. Default = 0.0.
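Concretely, the multiplier presumably follows the standard cosine-annealing form (a sketch; \(\eta_{\min}\) denotes min_factor and \(s\) the SSR):

\[ \alpha(t) = \eta_{\min} + (1 - \eta_{\min}) \cdot \frac{1 + \cos\left(\pi \, \frac{t}{s \cdot t_{\max}}\right)}{2} \]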
- composer.optim.scheduler.cosine_annealing_warm_restarts_scheduler(state, *, ssr=1.0, t_0, t_mult=1.0, min_factor=0.0)[source]#
Cyclically decays the learning rate according to the decreasing part of a cosine curve.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
t_mult (float) – The multiplier for subsequent cycles' durations. Default = 1.0.
min_factor (float) – Minimum factor. Default = 0.0.
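A sketch of the cyclic variant, assuming cycle \(i\) has duration \(T_i = t_0 \cdot t_{\mathrm{mult}}^i\) and \(t'\) measures time since the most recent restart:

\[ \alpha(t) = \eta_{\min} + (1 - \eta_{\min}) \cdot \frac{1 + \cos\left(\pi \, \frac{t'}{T_i}\right)}{2} \]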
- composer.optim.scheduler.cosine_annealing_with_warmup_scheduler(state, *, ssr=1.0, warmup_time, t_max='1dur', min_factor=0.0)[source]#
Decays the learning rate according to the decreasing part of a cosine curve, with a linear warmup.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
min_factor (float) – Minimum factor. Default = 0.0.
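A sketch of the warmup variant: the multiplier presumably ramps linearly from 0 to 1 over warmup_time, then follows the cosine form above over the remaining budget (the same linear warmup factor plausibly applies to the other *_with_warmup schedulers on this page):

\[ \alpha(t) = \begin{cases} t / t_{\mathrm{warmup}} & t < t_{\mathrm{warmup}} \\ \eta_{\min} + (1 - \eta_{\min}) \cdot \frac{1 + \cos(\pi \tau)}{2} & t \ge t_{\mathrm{warmup}} \end{cases} \]

where \(\tau\) is the fraction of the post-warmup budget that has elapsed.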
- composer.optim.scheduler.exponential_scheduler(state, *, ssr=1.0, gamma)[source]#
Decays the learning rate exponentially.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
gamma (float) – Gamma.
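A sketch of the exponential form, with \(t\) measured in the scheduler's time unit and \(s\) the SSR (an assumption; the docstring does not state the unit):

\[ \alpha(t) = \gamma^{\,t / s} \]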
- composer.optim.scheduler.linear_scheduler(state, *, ssr=1.0, start_factor=1.0, end_factor=0.0, total_time='1dur')[source]#
Adjusts the learning rate linearly.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
start_factor (float) – Start factor. Default = 1.0.
end_factor (float) – End factor. Default = 0.0.
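A sketch of the linear form, interpolating from start_factor \(f_0\) to end_factor \(f_1\) over total_time \(t_{\mathrm{total}}\), scaled by the SSR \(s\):

\[ \alpha(t) = f_0 + (f_1 - f_0) \cdot \min\!\left(1, \frac{t}{s \cdot t_{\mathrm{total}}}\right) \]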
- composer.optim.scheduler.linear_with_warmup_scheduler(state, *, ssr=1.0, warmup_time, start_factor=1.0, end_factor=0.0, total_time='1dur')[source]#
Adjusts the learning rate linearly, with a linear warmup.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
start_factor (float) – Start factor. Default = 1.0.
end_factor (float) – End factor. Default = 0.0.
- composer.optim.scheduler.multi_step_scheduler(state, *, ssr=1.0, milestones, gamma=0.1)[source]#
Decays the learning rate discretely at fixed milestones.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
milestones (list of str or Time) – Milestones.
gamma (float) – Gamma. Default = 0.1.
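A sketch of the multi-step form: the multiplier is presumably \(\gamma\) raised to the number of milestones already passed, with each milestone scaled by the SSR \(s\):

\[ \alpha(t) = \gamma^{\,n(t)}, \qquad n(t) = \left|\{\, m \in \text{milestones} : m \cdot s \le t \,\}\right| \]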
- composer.optim.scheduler.multi_step_with_warmup_scheduler(state, *, ssr=1.0, warmup_time, milestones, gamma=0.1)[source]#
Decays the learning rate discretely at fixed milestones, with a linear warmup.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
milestones (list of str or Time) – Milestones.
gamma (float) – Gamma. Default = 0.1.
- composer.optim.scheduler.polynomial_scheduler(state, *, ssr=1.0, t_max='1dur', power, min_factor=0.0)[source]#
Decays the learning rate proportionally to a power of the fraction of training time remaining.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
power (float) – Power.
min_factor (float) – Minimum factor. Default = 0.0.
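A sketch of the polynomial form, with \(\eta_{\min}\) denoting min_factor and \(s\) the SSR:

\[ \alpha(t) = \eta_{\min} + (1 - \eta_{\min}) \cdot \left(1 - \frac{t}{s \cdot t_{\max}}\right)^{\mathrm{power}} \]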
- composer.optim.scheduler.step_scheduler(state, *, ssr=1.0, step_size, gamma=0.1)[source]#
Decays the learning rate discretely at fixed intervals.
- Parameters
state (State) – The current Composer Trainer state.
ssr (float) – The scale schedule ratio. In general, the learning rate computed by this scheduler at time \(t\) with an SSR of 1.0 should be the same as that computed by this scheduler at time \(t \times s\) with an SSR of \(s\). Default = 1.0.
gamma (float) – Gamma. Default = 0.1.
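A sketch of the step form: the multiplier presumably decays by a factor of \(\gamma\) once per interval of length step_size \(t_{\mathrm{step}}\), scaled by the SSR \(s\):

\[ \alpha(t) = \gamma^{\,\lfloor t / (s \cdot t_{\mathrm{step}}) \rfloor} \]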