composer.Event
Events represent specific points in the training loop where Algorithms and Callbacks can run.
Note
By convention, Callbacks should not modify the state; they are used for non-essential reporting functions such as logging or timing. Methods that need to modify the state should be implemented as an Algorithm.
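For example, a reporting-only callback might time each epoch. The sketch below assumes that Callback exposes per-event hook methods named after the events (e.g. epoch_start, epoch_end), each receiving the training state and a logger; exact signatures and logging calls vary between Composer versions.

```python
import time

from composer import Callback


class EpochTimer(Callback):
    """Reports how long each epoch takes; reads the state but never modifies it."""

    def epoch_start(self, state, logger):
        self._epoch_start_time = time.time()

    def epoch_end(self, state, logger):
        elapsed = time.time() - self._epoch_start_time
        # In practice you would report this through the provided logger;
        # the exact logging call differs between Composer versions.
        print(f"epoch took {elapsed:.1f}s")
```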
Events List
Available events include:
Name | Description
---|---
`INIT` | Immediately after model initialization, and before the creation of optimizers and schedulers. Model surgery typically occurs here.
`TRAINING_START` | Start of training. For multi-GPU training, runs after the DDP process fork.
`EPOCH_START`, `EPOCH_END` | Start and end of an epoch.
`BATCH_START`, `BATCH_END` | Start and end of a batch, inclusive of the optimizer step and any gradient scaling.
`AFTER_DATALOADER` | Immediately after the dataloader is called. Typically used for on-GPU dataloader transforms.
`BEFORE_TRAIN_BATCH`, `AFTER_TRAIN_BATCH` | Before and after the forward-loss-backward computation for a training batch. When using gradient accumulation, these are still called only once.
`BEFORE_FORWARD`, `AFTER_FORWARD` | Before and after the call to `model.forward()`.
`BEFORE_LOSS`, `AFTER_LOSS` | Before and after the loss computation.
`BEFORE_BACKWARD`, `AFTER_BACKWARD` | Before and after the backward pass.
`TRAINING_END` | End of training.
`EVAL_START`, `EVAL_END` | Start and end of evaluation through the validation dataset.
`EVAL_BATCH_START`, `EVAL_BATCH_END` | Before and after the call to `model.validate(batch)` for each evaluation batch.
`EVAL_BEFORE_FORWARD`, `EVAL_AFTER_FORWARD` | Before and after the call to `model.validate(batch)`.
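These names are what an Algorithm sees in its match method. Below is a minimal sketch, assuming Composer's Algorithm interface of match(event, state) and apply(event, state, logger); the gradient-noise helper is hypothetical and only illustrates where a state-modifying method would hook in.

```python
from composer import Algorithm, Event


class GradNoise(Algorithm):
    """Toy example: perturb gradients right after the backward pass."""

    def match(self, event, state):
        # The trainer calls match() at every event; return True only for
        # the events this algorithm cares about.
        return event == Event.AFTER_BACKWARD

    def apply(self, event, state, logger):
        add_gradient_noise(state.model)  # hypothetical helper, not part of Composer
```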
Training Loop
For a conceptual understanding of when events run within the trainer, see the pseudo-code outline below:
model = your_model()

<INIT> # model surgery here

optimizers = SGD(model.parameters(), lr=0.01)
schedulers = CosineAnnealing(optimizers, T_max=90)

ddp.launch() # for multi-GPUs, processes are forked here

<TRAINING_START> # has access to process rank for DDP

for epoch in range(90):
    <EPOCH_START>

    for batch in dataloader:
        <AFTER_DATALOADER>
        <BATCH_START>

        # -- closure: forward/backward/loss -- #
        <BEFORE_TRAIN_BATCH>

        # for gradient accumulation
        for microbatch in batch:
            <BEFORE_FORWARD>
            outputs = model.forward(microbatch)
            <AFTER_FORWARD>

            <BEFORE_LOSS>
            loss = model.loss(outputs, microbatch)
            <AFTER_LOSS>

            <BEFORE_BACKWARD>
            loss.backward()
            <AFTER_BACKWARD>

        gradient_unscaling()  # for mixed precision
        gradient_clipping()

        <AFTER_TRAIN_BATCH>
        # ------------------------------------ #

        optimizers.step()  # grad scaling (AMP) also

        <BATCH_END>

        schedulers.step('step')
        maybe_eval()

    schedulers.step('epoch')
    maybe_eval()

    <EPOCH_END>

<TRAINING_END>


def maybe_eval():
    <EVAL_START>

    for batch in eval_dataloader:
        <EVAL_BATCH_START>

        <EVAL_BEFORE_FORWARD>
        outputs, targets = model.validate(batch)
        <EVAL_AFTER_FORWARD>

        metrics.update(outputs, targets)

        <EVAL_BATCH_END>

    <EVAL_END>
Note
Several events occur right after each other (e.g. AFTER_DATALOADER and BATCH_START). We keep these separate because algorithms and callbacks may want to run, for example, only after all of the dataloader transforms.
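For instance, an algorithm that applies on-GPU batch transforms would target AFTER_DATALOADER so that it runs once every dataloader transform has finished, while anything that should see the finalized batch can hook BATCH_START instead. A minimal sketch follows; the (inputs, targets) batch layout and the normalization constants are assumptions made for illustration.

```python
from composer import Algorithm, Event


class GPUNormalize(Algorithm):
    """Normalizes inputs on the device, after all dataloader transforms have run."""

    def __init__(self, mean: float, std: float):
        self.mean = mean
        self.std = std

    def match(self, event, state):
        return event == Event.AFTER_DATALOADER

    def apply(self, event, state, logger):
        inputs, targets = state.batch  # assumes (inputs, targets) batches
        state.batch = ((inputs - self.mean) / self.std, targets)
```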
API Reference
- class composer.core.event.Event(value)
An event that occurs during the execution of the training and evaluation loops.
For a conceptual understanding of when events run within the trainer, see the pseudo-code outline in the Training Loop section above.
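Since Event is an enum, members can be looked up by value or enumerated directly. The sketch below assumes the conventional lowercase string values (e.g. "epoch_start"); check the values in your Composer version.

```python
from composer.core.event import Event

ev = Event("epoch_start")   # look up a member by its value (assumed lowercase string)
assert ev is Event.EPOCH_START

for event in Event:         # enumerate every event the trainer can fire
    print(event.name, event.value)
```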