Events
Events represent specific points in the training loop where an Algorithm and Callback can run.
- class composer.core.Event(value)
Enum to represent events in the training loop.
The following pseudocode shows where each event fires in the training loop:
```
# <INIT>
# <FIT_START>
for epoch in range(NUM_EPOCHS):
    # <EPOCH_START>
    for batch in dataloader:
        # <AFTER_DATALOADER>

        # <BATCH_START>

        # <BEFORE_TRAIN_BATCH>
        for microbatch in batch.split(grad_accum):

            # <BEFORE_FORWARD>
            outputs = model(batch)
            # <AFTER_FORWARD>

            # <BEFORE_LOSS>
            loss = model.loss(outputs, batch)
            # <AFTER_LOSS>

            # <BEFORE_BACKWARD>
            loss.backward()
            # <AFTER_BACKWARD>

        # Un-scale and clip gradients

        # <AFTER_TRAIN_BATCH>
        optimizer.step()

        # <BATCH_END>

        if should_eval(batch=True):
            for eval_dataloader in eval_dataloaders:
                # <EVAL_START>
                for batch in eval_dataloader:
                    # <EVAL_BATCH_START>

                    # <EVAL_BEFORE_FORWARD>
                    outputs, targets = model(batch)
                    # <EVAL_AFTER_FORWARD>

                    metrics.update(outputs, targets)
                    # <EVAL_BATCH_END>
                # <EVAL_END>

        # <BATCH_CHECKPOINT>
    # <EPOCH_END>

    if should_eval(batch=False):
        for eval_dataloader in eval_dataloaders:
            # <EVAL_START>
            for batch in eval_dataloader:
                # <EVAL_BATCH_START>

                # <EVAL_BEFORE_FORWARD>
                outputs, targets = model(batch)
                # <EVAL_AFTER_FORWARD>

                metrics.update(outputs, targets)
                # <EVAL_BATCH_END>
            # <EVAL_END>

    # <EPOCH_CHECKPOINT>
# <FIT_END>
```
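As a self-contained sketch of the ordering above (not Composer's actual implementation; the `Event` values and loop here are a toy subset for illustration), events can be modeled as an enum fired in sequence from a plain Python training loop:

```python
from enum import Enum

class Event(Enum):
    """Toy stand-in for a subset of composer.core.Event."""
    EPOCH_START = "epoch_start"
    BATCH_START = "batch_start"
    BEFORE_FORWARD = "before_forward"
    AFTER_FORWARD = "after_forward"
    BATCH_END = "batch_end"
    EPOCH_END = "epoch_end"

fired = []  # records each event in the order it fires

def run_event(event):
    fired.append(event)

# Minimal loop mirroring the pseudocode: 2 epochs of 2 batches each.
for epoch in range(2):
    run_event(Event.EPOCH_START)
    for batch in range(2):
        run_event(Event.BATCH_START)
        run_event(Event.BEFORE_FORWARD)
        # outputs = model(batch) would run here
        run_event(Event.AFTER_FORWARD)
        run_event(Event.BATCH_END)
    run_event(Event.EPOCH_END)
```

After the loop, `fired` begins with `EPOCH_START`, ends with `EPOCH_END`, and contains one `BEFORE_FORWARD` per batch (four in total).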
- INIT
Invoked in the constructor of Trainer. Model surgery (see module_surgery) typically occurs here.
- FIT_START
Invoked at the beginning of each call to Trainer.fit(). Dataset transformations typically occur here.
- EPOCH_START
Start of an epoch.
- AFTER_DATALOADER
Immediately after the dataloader is called. Typically used for on-GPU dataloader transforms.
- BATCH_START
Start of a batch.
- BEFORE_TRAIN_BATCH
Before the forward-loss-backward computation for a training batch. When using gradient accumulation, this event still fires only once per batch.
- BEFORE_FORWARD
Before the call to model.forward(). This is called multiple times per batch when using gradient accumulation.
- AFTER_FORWARD
After the call to model.forward(). This is called multiple times per batch when using gradient accumulation.
- BEFORE_LOSS
Before the call to model.loss(). This is called multiple times per batch when using gradient accumulation.
- AFTER_LOSS
After the call to model.loss(). This is called multiple times per batch when using gradient accumulation.
- BEFORE_BACKWARD
Before the call to loss.backward(). This is called multiple times per batch when using gradient accumulation.
- AFTER_BACKWARD
After the call to loss.backward(). This is called multiple times per batch when using gradient accumulation.
- AFTER_TRAIN_BATCH
After the forward-loss-backward computation for a training batch. When using gradient accumulation, this event still fires only once.
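The once-per-batch versus once-per-microbatch distinction above can be made concrete with a self-contained sketch (the event names match the enum, but the batch splitting and `fire` helper below are invented for illustration and are not Composer's API):

```python
from collections import Counter

counts = Counter()  # how many times each event fired

def fire(name):
    counts[name] += 1

# One training batch of 8 samples split into grad_accum=4 microbatches.
batch = list(range(8))
grad_accum = 4
microbatches = [batch[i::grad_accum] for i in range(grad_accum)]

fire("BEFORE_TRAIN_BATCH")               # fires once per batch
for microbatch in microbatches:
    fire("BEFORE_FORWARD")               # fires once per microbatch
    fire("AFTER_FORWARD")
    fire("BEFORE_LOSS")
    fire("AFTER_LOSS")
    fire("BEFORE_BACKWARD")
    fire("AFTER_BACKWARD")
fire("AFTER_TRAIN_BATCH")                # fires once per batch
```

With `grad_accum=4`, the `*_TRAIN_BATCH` events each fire once while the forward/loss/backward events each fire four times.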
- BATCH_END
End of a batch, which occurs after the optimizer step and any gradient scaling.
- BATCH_CHECKPOINT
After Event.BATCH_END and any batch-wise evaluation. Saving checkpoints at this event allows the checkpoint saver to use the results from any batch-wise evaluation to determine whether a checkpoint should be saved.
- EPOCH_END
End of an epoch.
- EPOCH_CHECKPOINT
After Event.EPOCH_END and any epoch-wise evaluation. Saving checkpoints at this event allows the checkpoint saver to use the results from any epoch-wise evaluation to determine whether a checkpoint should be saved.
- FIT_END
Invoked at the end of each call to Trainer.fit(). This event exists primarily for logging information and flushing callbacks. Algorithms should not transform the training state on this event, as any changes will not be preserved in checkpoints.
- EVAL_START
Start of evaluation through the validation dataset.
- EVAL_BATCH_START
Before the call to model.validate(batch).
- EVAL_BEFORE_FORWARD
Before the call to model.validate(batch).
- EVAL_AFTER_FORWARD
After the call to model.validate(batch).
- EVAL_BATCH_END
After the call to model.validate(batch).
- EVAL_END
End of evaluation through the validation dataset.
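The eval events follow the same nesting as the pseudocode at the top of this page. As a self-contained sketch (event names match the enum; the trace list and loop are a toy illustration, not Composer internals), one evaluation pass over two batches fires:

```python
trace = []  # eval events in firing order

def fire(name):
    trace.append(name)

fire("EVAL_START")                  # once per evaluation pass
for batch in range(2):
    fire("EVAL_BATCH_START")
    fire("EVAL_BEFORE_FORWARD")
    # outputs, targets = model(batch) would run here
    fire("EVAL_AFTER_FORWARD")
    # metrics.update(outputs, targets) would run here
    fire("EVAL_BATCH_END")
fire("EVAL_END")                    # once per evaluation pass
```

The pass is bracketed by a single `EVAL_START`/`EVAL_END` pair, with the four batch-level events repeating once per eval batch.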