RankZeroLoggerBackend
- class composer.core.logging.base_backend.RankZeroLoggerBackend
Bases: composer.core.logging.base_backend.BaseLoggerBackend, composer.core.callback.RankZeroCallback, abc.ABC
Base class for logging backends that run only on the rank zero process.
In a multi-process training setup (e.g. when using DistributedDataParallel), some logging backends require that only the rank zero process log data. For example, when logging to a file, only the main process should open the file and save data.
When using this class, override _will_log(), _log_metric(), and _training_start() instead of will_log(), log_metric(), and training_start(), respectively.
This class ensures that _log_metric() and _training_start() are invoked only on the rank zero process.
It captures all logged data before the global rank is available. On the rank zero process, during the TRAINING_START event (which occurs after the global rank is set), it routes all captured logged data to _log_metric(). For other processes, the captured log data is eventually discarded.
- _will_log(state: State, log_level: LogLevel) -> bool
Called by the Logger to determine whether the logging backend will log a metric.
By default, it always returns True, but this method can be overridden.
- _log_metric(epoch: int, step: int, log_level: LogLevel, data: TLogData) -> None
Called by the Logger for metrics where will_log() returned True.
The logging backend should override this function to log the data (e.g. write it to a file or send it to a server).
- _training_start(state: State, logger: Logger) -> None
Callback called on the TRAINING_START event.
- final log_metric(epoch: int, step: int, log_level: LogLevel, data: TLogData) -> None
Called by the Logger for metrics where will_log() returned True.
This method is final: subclasses should override _log_metric() to log the data (e.g. write it to a file or send it to a server), not this method.
- final training_start(state: State, logger: Logger) -> None
Called on the TRAINING_START event.
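
The capture-then-route behavior described for this class can be sketched without Composer installed. The block below is a minimal illustration, not Composer's implementation: RankZeroBufferingBackend, InMemoryBackend, and simulate() are hypothetical stand-ins, training_start() takes the global rank directly instead of (state, logger), and calling _training_start() before replaying the buffer is an assumption about ordering that the text above does not specify.

```python
from typing import Any, Dict, List, Optional, Tuple

LogRecord = Tuple[int, int, str, Dict[str, Any]]


class RankZeroBufferingBackend:
    """Hypothetical stand-in for the pattern described above (not Composer code)."""

    def __init__(self) -> None:
        self._global_rank: Optional[int] = None  # unknown until training starts
        self._buffer: List[LogRecord] = []

    # Public entry points: treated as final, subclasses leave these alone.
    def log_metric(self, epoch: int, step: int, log_level: str,
                   data: Dict[str, Any]) -> None:
        if self._global_rank is None:
            # Rank not known yet: capture the record on every process.
            self._buffer.append((epoch, step, log_level, data))
        elif self._global_rank == 0:
            self._log_metric(epoch, step, log_level, data)
        # Non-zero ranks drop the record silently.

    def training_start(self, global_rank: int) -> None:
        self._global_rank = global_rank
        if global_rank == 0:
            self._training_start()  # assumed to run before the replay
            for record in self._buffer:
                self._log_metric(*record)
        self._buffer.clear()  # other ranks discard what they captured

    # Hooks that subclasses override.
    def _log_metric(self, epoch: int, step: int, log_level: str,
                    data: Dict[str, Any]) -> None:
        raise NotImplementedError

    def _training_start(self) -> None:
        pass


class InMemoryBackend(RankZeroBufferingBackend):
    """Example subclass: overrides only the underscore-prefixed hook."""

    def __init__(self) -> None:
        super().__init__()
        self.records: List[Tuple[int, int, Dict[str, Any]]] = []

    def _log_metric(self, epoch: int, step: int, log_level: str,
                    data: Dict[str, Any]) -> None:
        self.records.append((epoch, step, data))


def simulate(global_rank: int) -> List[Tuple[int, int, Dict[str, Any]]]:
    backend = InMemoryBackend()
    backend.log_metric(0, 0, "FIT", {"lr": 0.1})      # before the rank is known
    backend.training_start(global_rank)               # rank becomes available
    backend.log_metric(0, 1, "BATCH", {"loss": 2.5})  # after the rank is known
    return backend.records
```

Calling simulate(0) yields both records, including the one logged before the rank was known, while simulate(1) yields an empty list. Keeping the public methods fixed and exposing only the underscore-prefixed hooks is what lets the base class enforce the rank check in one place.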