speed_monitor

Monitor throughput during training.

Classes

SpeedMonitor

Logs the training throughput.

class composer.callbacks.speed_monitor.SpeedMonitor(window_size=100)

Bases: composer.core.callback.Callback

Logs the training throughput.

The training throughput, measured in samples per second, is logged on the Event.BATCH_END event once window_size batches have been recorded.

The wall clock train time is logged on every Event.BATCH_END event.

The average throughput over an epoch is logged on the Event.EPOCH_END event.

Example

>>> from composer import Trainer
>>> from composer.callbacks import SpeedMonitor
>>> # constructing trainer object with this callback
>>> trainer = Trainer(
...     model=model,
...     train_dataloader=train_dataloader,
...     eval_dataloader=eval_dataloader,
...     optimizers=optimizer,
...     max_duration='1ep',
...     callbacks=[SpeedMonitor(window_size=100)],
... )

The Logger records the training throughput and wall clock times under the following keys.

Key                         Logged data
throughput/samples_per_sec  Rolling average (over window_size most recent batches) of the number of samples processed per second
wall_clock/train            Total elapsed training time
wall_clock/val              Total elapsed validation time
wall_clock/total            Total elapsed time (wall_clock/train + wall_clock/val)
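To make the rolling average behind throughput/samples_per_sec concrete, here is a minimal sketch in plain Python. It is not Composer's implementation: the RollingThroughput class and its update method are hypothetical names used only for illustration. It also assumes the throughput is the total number of samples in the window divided by the total time spent on those batches, and that nothing is reported until the window is full.

```python
from collections import deque


class RollingThroughput:
    """Hypothetical helper (not part of Composer): rolling samples/sec over a window."""

    def __init__(self, window_size=100):
        self.window_size = window_size
        # With maxlen set, each deque drops its oldest entry once it is full.
        self.batch_samples = deque(maxlen=window_size)
        self.batch_seconds = deque(maxlen=window_size)

    def update(self, num_samples, elapsed_seconds):
        """Record one batch; return samples/sec, or None until the window is full."""
        self.batch_samples.append(num_samples)
        self.batch_seconds.append(elapsed_seconds)
        if len(self.batch_samples) < self.window_size:
            return None
        # Assumption: total samples in the window over total time in the window.
        return sum(self.batch_samples) / sum(self.batch_seconds)


monitor = RollingThroughput(window_size=3)
# Four batches of 32 samples each, taking 0.5 s, 0.5 s, 1.0 s and 2.0 s.
readings = [monitor.update(32, seconds) for seconds in (0.5, 0.5, 1.0, 2.0)]
print(readings)
```

The first two readings are None because fewer than window_size batches have been seen. A larger window_size smooths out per-batch noise but delays the first reading and reacts more slowly to genuine changes in speed.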

Parameters

window_size (int, optional) – Number of batches to use for a rolling average of throughput. Defaults to 100.