composer.algorithms.sam.sam#

SAM algorithm and optimizer class.

Classes

SAM

Adds sharpness-aware minimization (Foret et al, 2020) by wrapping an existing optimizer with a SAMOptimizer.

SAMOptimizer

Wraps an optimizer with sharpness-aware minimization (Foret et al, 2020).

class composer.algorithms.sam.sam.SAM(rho=0.05, epsilon=1e-12, interval=1)[source]#

Bases: composer.core.algorithm.Algorithm

Adds sharpness-aware minimization (Foret et al, 2020) by wrapping an existing optimizer with a SAMOptimizer. SAM can improve model generalization and provide robustness to label noise.

Runs on INIT.

Parameters
  • rho (float, optional) โ€“ The neighborhood size parameter of SAM. Must be greater than 0. Default: 0.05.

  • epsilon (float, optional) โ€“ A small value added to the gradient norm for numerical stability. Default: 1e-12.

  • interval (int, optional) โ€“ SAM will run once per interval steps. A value of 1 will cause SAM to run every step. Steps on which SAM runs take roughly twice as much time to complete. Default: 1.

Example

from composer.algorithms import SAM
algorithm = SAM(rho=0.05, epsilon=1.0e-12, interval=1)
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration="1ep",
    algorithms=[algorithm],
    optimizers=[optimizer],
)
class composer.algorithms.sam.sam.SAMOptimizer(base_optimizer, rho=0.05, epsilon=1e-12, interval=1, **kwargs)[source]#

Bases: torch.optim.optimizer.Optimizer

Wraps an optimizer with sharpness-aware minimization (Foret et al, 2020). See SAM for details.

Implementation based on https://github.com/davda54/sam

Parameters
  • base_optimizer (Optimizer) โ€“

  • rho (float, optional) โ€“ The SAM neighborhood size. Must be greater than 0. Default: 0.05.

  • epsilon (float, optional) โ€“ A small value added to the gradient norm for numerical stability. Default: 1.0e-12.

  • interval (int, optional) โ€“ SAM will run once per interval steps. A value of 1 will cause SAM to run every step. Steps on which SAM runs take roughly twice as much time to complete. Default: 1.