RankZeroLoggerBackend

class composer.core.logging.base_backend.RankZeroLoggerBackend

Bases: composer.core.logging.base_backend.BaseLoggerBackend, composer.core.callback.RankZeroCallback, abc.ABC

Base class for logging backends that run only on the rank zero process.

In a multi-process training setup (e.g. when using DistributedDataParallel), some logging backends require that only the rank zero process log data. For example, when logging to a file, only the main process should open the file and save data.

When using this class, override _will_log(), _log_metric(), and _training_start() instead of will_log(), log_metric(), and training_start(), respectively.

This class ensures that _log_metric() and _training_start() are invoked only on the rank zero process.

It captures all data logged before the global rank is available. On the rank zero process, during the TRAINING_START event (which occurs after the global rank is set), it routes all captured data to _log_metric(). On other processes, the captured data is eventually discarded.
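The capture-then-route behavior described above can be sketched without Composer installed. The class below is a hypothetical stand-in, not Composer's implementation: the name BufferingBackendSketch, the `global_rank` argument to training_start(), and the use of plain strings for the log level are all assumptions made so the example is self-contained.

```python
from typing import Any, Dict, List, Tuple

# (epoch, step, log_level, data); log_level is a plain string in this sketch
Record = Tuple[int, int, str, Dict[str, Any]]


class BufferingBackendSketch:
    """Hypothetical stand-in that buffers log calls until the rank is known."""

    def __init__(self) -> None:
        self._pending: List[Record] = []  # calls captured before the rank is set
        self._rank_known = False
        self._is_rank_zero = False
        self.written: List[Record] = []   # what _log_metric() actually received

    def log_metric(self, epoch: int, step: int, log_level: str,
                   data: Dict[str, Any]) -> None:
        if not self._rank_known:
            # Rank not available yet: capture the call on every process.
            self._pending.append((epoch, step, log_level, data))
        elif self._is_rank_zero:
            self._log_metric(epoch, step, log_level, data)
        # On other ranks the call is dropped.

    def training_start(self, global_rank: int) -> None:
        # Assumption: the rank is passed in directly; a real backend would
        # read it from the distributed runtime.
        self._rank_known = True
        self._is_rank_zero = global_rank == 0
        if self._is_rank_zero:
            for record in self._pending:
                self._log_metric(*record)
        self._pending.clear()  # discarded on ranks other than zero

    def _log_metric(self, epoch: int, step: int, log_level: str,
                    data: Dict[str, Any]) -> None:
        # The hook a subclass would override (e.g. to write to a file).
        self.written.append((epoch, step, log_level, data))


def simulate(global_rank: int) -> List[Record]:
    """Log once before and once after training start on the given rank."""
    backend = BufferingBackendSketch()
    backend.log_metric(0, 0, "FIT", {"lr": 0.1})
    backend.training_start(global_rank)
    backend.log_metric(0, 1, "BATCH", {"loss": 2.5})
    return backend.written
```

In this sketch, simulate(0) returns both records in the order they were logged, while simulate(1) returns an empty list because the buffered record is discarded and later calls are ignored.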

_will_log(state: State, log_level: LogLevel) → bool

Called by the Logger to determine whether the logging backend will log a metric.

By default, it always returns True, but this method can be overridden.

Parameters
  • state (State) – The global state object.

  • log_level (LogLevel) – The log level.

Returns
  • bool – Whether to log a metric call, given the State and LogLevel.
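As an illustration of overriding _will_log(), the sketch below filters metrics by log level. It does not import Composer: the LogLevel enum here is a hypothetical stand-in whose member names and values are assumptions, and EpochOnlyBackend is an invented example class rather than a real backend.

```python
from enum import IntEnum


class LogLevel(IntEnum):
    """Hypothetical stand-in; member names and values are assumptions."""
    FIT = 1
    EPOCH = 2
    BATCH = 3


class EpochOnlyBackend:
    """Invented example backend that skips per-batch metrics."""

    def __init__(self, max_level: LogLevel = LogLevel.EPOCH) -> None:
        self.max_level = max_level

    def _will_log(self, state: object, log_level: LogLevel) -> bool:
        # `state` is unused in this sketch; a real backend might inspect it.
        return log_level <= self.max_level
```

With the default `max_level`, this backend accepts FIT and EPOCH metrics and rejects BATCH metrics.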

_log_metric(epoch: int, step: int, log_level: LogLevel, data: TLogData) → None

Called by the Logger for metrics where will_log() returned True.

The logging backend should override this function to log the data (e.g. write it to a file or send it to a server).

Parameters
  • epoch (int) – The epoch for the logged data.

  • step (int) – The global step for the logged data.

  • log_level (LogLevel) – The log level.

  • data (TLogData) – The metric to log.

_training_start(state: State, logger: Logger) → None

Callback called on the TRAINING_START event.

Parameters
  • state (State) – The global state.

  • logger (Logger) – The global logger.

final log_metric(epoch: int, step: int, log_level: LogLevel, data: TLogData) → None

Called by the Logger for metrics where will_log() returned True.

This method is final. To log the data (e.g. write it to a file or send it to a server), override _log_metric() instead.

Parameters
  • epoch (int) – The epoch for the logged data.

  • step (int) – The global step for the logged data.

  • log_level (LogLevel) – The log level.

  • data (TLogData) – The metric to log.

final training_start(state: State, logger: Logger) → None

Called on the TRAINING_START event.

Parameters
  • state (State) – The global state.

  • logger (Logger) – The logger.

final will_log(state: State, log_level: LogLevel) → bool

Called by the Logger to determine whether to log a metric.

This method is final. To customize the decision, override _will_log() instead.

Parameters
  • state (State) – The global state object.

  • log_level (LogLevel) – The log level.

Returns
  • bool – Whether to log a metric call, given the State and LogLevel.