composer.core.time#

Track training progress in terms of epochs, batches, samples, and tokens.

Callbacks, algorithms, and schedulers can use the current training time to fire at certain points in the training process.

The Timer class tracks the total number of epochs, batches, samples, and tokens. The trainer is responsible for updating the Timer at the end of every epoch and batch. There is only one instance of the Timer, which is attached to the State.
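
As a rough illustration of how a callback might use these counters to fire at a fixed interval, the sketch below uses a hypothetical `ToyTimer` stand-in rather than the real Timer. The names `ToyTimer`, `should_fire`, and `interval_batches` are invented for this example and are not part of the library.

```python
from dataclasses import dataclass


@dataclass
class ToyTimer:
    """Hypothetical stand-in for Timer; it only tracks epochs and batches."""

    epoch: int = 0
    batch: int = 0


def should_fire(timer: ToyTimer, interval_batches: int) -> bool:
    """Return True when the batch count is a positive multiple of the interval."""
    return timer.batch > 0 and timer.batch % interval_batches == 0


timer = ToyTimer()
fired_at = []
for _ in range(10):       # pretend to train for 10 batches
    timer.batch += 1      # in this sketch, the loop plays the role of the trainer
    if should_fire(timer, interval_batches=4):
        fired_at.append(timer.batch)
```

Here `fired_at` ends up holding the batch counts at which the hypothetical callback would have run.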

The Time class represents static durations of training time or points in the training process in terms of a specific TimeUnit enum. The Time class supports comparisons, arithmetic, and conversions.

Functions

cast

Cast a value to a type.

Classes

Generic

Abstract base class for generic types.

NamedTuple

Typed version of namedtuple.

Serializable

Interface for serialization; used by checkpointing.

StringEnum

Base class for Enums containing string values.

Time

Time represents static durations of training time or points in the training process in terms of a TimeUnit enum (epochs, batches, samples, tokens, or duration).

TimeUnit

Units of time for the training process.

Timer

Timer tracks the current training progress, in terms of epochs, batches, samples, and tokens.

Timestamp

Timestamp represents a snapshot of Timer.

TypeVar

Type variable.

Attributes

  • TValue

  • TYPE_CHECKING

  • Union

  • annotations

class composer.core.time.Time(value, unit)[source]#

Bases: Generic[composer.core.time.TValue]

Time represents static durations of training time or points in the training process in terms of a TimeUnit enum (epochs, batches, samples, tokens, or duration).

To construct an instance of Time, you can either:

  1. Use a value followed by a TimeUnit enum or string. For example,

    >>> Time(5, TimeUnit.EPOCH)  # describes 5 epochs.
    Time(5, TimeUnit.EPOCH)
    >>> Time(30_000, "tok")  # describes 30,000 tokens.
    Time(30000, TimeUnit.TOKEN)
    >>> Time(0.5, "dur")  # describes 50% of the training process.
    Time(0.5, TimeUnit.DURATION)
    
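  2. Use one of the helper methods: Time.from_epoch(), Time.from_batch(), Time.from_sample(), Time.from_token(), Time.from_duration(), or Time.from_timestring().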

Time supports addition and subtraction with other Time instances that share the same TimeUnit. For example:

>>> Time(1, TimeUnit.EPOCH) + Time(2, TimeUnit.EPOCH)
Time(3, TimeUnit.EPOCH)

Time supports multiplication. The multiplier must be either a number or have units of TimeUnit.DURATION. The multiplicand is scaled, and its units are kept.

>>> Time(2, TimeUnit.EPOCH) * 0.5
Time(1, TimeUnit.EPOCH)
>>> Time(2, TimeUnit.EPOCH) * Time(0.5, TimeUnit.DURATION)
Time(1, TimeUnit.EPOCH)

Time supports division. If the divisor is an instance of Time, then it must have the same units as the dividend, and the result has units of TimeUnit.DURATION. For example:

>>> Time(4, TimeUnit.EPOCH) / Time(2, TimeUnit.EPOCH)
Time(2.0, TimeUnit.DURATION)

If the divisor is a number, then the dividend is scaled, and it keeps its units. For example:

>>> Time(4, TimeUnit.EPOCH) / 2
Time(2, TimeUnit.EPOCH)
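
The unit rules above can be sketched with a small stand-in class. This is a minimal sketch, not the library's implementation: `ToyTime` is a hypothetical name, units are plain strings instead of TimeUnit members, and results are not cast back to integers.

```python
from dataclasses import dataclass
from typing import Union

Number = Union[int, float]


@dataclass(frozen=True)
class ToyTime:
    """Hypothetical, simplified stand-in for Time; units are plain strings."""

    value: Number
    unit: str  # for example "ep", "ba", or "dur"

    def _require_same_unit(self, other: "ToyTime") -> None:
        if self.unit != other.unit:
            raise ValueError(f"unit mismatch: {self.unit!r} vs {other.unit!r}")

    def __add__(self, other: "ToyTime") -> "ToyTime":
        self._require_same_unit(other)
        return ToyTime(self.value + other.value, self.unit)

    def __sub__(self, other: "ToyTime") -> "ToyTime":
        self._require_same_unit(other)
        return ToyTime(self.value - other.value, self.unit)

    def __mul__(self, other: Union[Number, "ToyTime"]) -> "ToyTime":
        # A ToyTime multiplier must be a duration; the left operand keeps its unit.
        if isinstance(other, ToyTime):
            if other.unit != "dur":
                raise ValueError("a ToyTime multiplier must be in 'dur'")
            other = other.value
        return ToyTime(self.value * other, self.unit)

    def __truediv__(self, other: Union[Number, "ToyTime"]) -> "ToyTime":
        # Dividing two values with the same unit yields a duration (a fraction).
        if isinstance(other, ToyTime):
            self._require_same_unit(other)
            return ToyTime(self.value / other.value, "dur")
        return ToyTime(self.value / other, self.unit)


half_run = ToyTime(10, "ep") * ToyTime(0.5, "dur")  # half of a 10-epoch run
progress = ToyTime(3, "ep") / ToyTime(12, "ep")     # fraction of 12 epochs done
```

In this sketch, mixing units in addition, subtraction, or Time-by-Time division raises a `ValueError`, mirroring the same-unit requirement described above.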

Parameters
  • value (int or float) – The amount of time.

  • unit (str or TimeUnit) – The unit for value.

classmethod from_batch(batch)[source]#

Create a Time with units of TimeUnit.BATCH. Equivalent to Time(batch, TimeUnit.BATCH).

Parameters

batch (int) – Number of batches.

Returns

Time – Time instance, in batches.

classmethod from_duration(duration)[source]#

Create a Time with units of TimeUnit.DURATION. Equivalent to Time(duration, TimeUnit.DURATION).

Parameters

duration (float) – Duration of the training process. Should be on [0, 1) where 0 represents the beginning of the training process and 1 represents a completed training process.

Returns

Time – Time instance, in duration.

classmethod from_epoch(epoch)[source]#

Create a Time with units of TimeUnit.EPOCH. Equivalent to Time(epoch, TimeUnit.EPOCH).

Parameters

epoch (int) – Number of epochs.

Returns

Time – Time instance, in epochs.

classmethod from_sample(sample)[source]#

Create a Time with units of TimeUnit.SAMPLE. Equivalent to Time(sample, TimeUnit.SAMPLE).

Parameters

sample (int) – Number of samples.

Returns

Time – Time instance, in samples.

classmethod from_timestring(timestring)[source]#

Parse a time string into a Time instance. A time string is a numerical value followed by the value of a TimeUnit enum. For example:

>>> Time.from_timestring("5ep")  # describes 5 epochs.
Time(5, TimeUnit.EPOCH)
>>> Time.from_timestring("3e4tok")  # describes 30,000 tokens.
Time(30000, TimeUnit.TOKEN)
>>> Time.from_timestring("0.5dur")  # describes 50% of the training process.
Time(0.5, TimeUnit.DURATION)

Returns

Time – An instance of Time.
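
To make the time string format concrete, here is a minimal parsing sketch. It is not the library's parser: `parse_timestring` is a hypothetical helper that returns a plain `(value, unit)` tuple, and the suffixes `"ba"` and `"sp"` are assumptions (only `"ep"`, `"tok"`, and `"dur"` appear in the examples above).

```python
import re
from typing import Tuple, Union

# A number (optionally signed, optionally in scientific notation) followed by
# a unit suffix. "ba" and "sp" are assumed suffixes for batches and samples.
_TIMESTRING = re.compile(
    r"^\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)\s*(ep|ba|sp|tok|dur)\s*$"
)


def parse_timestring(timestring: str) -> Tuple[Union[int, float], str]:
    """Split a time string such as "3e4tok" into a (value, unit) pair."""
    match = _TIMESTRING.match(timestring)
    if match is None:
        raise ValueError(f"invalid time string: {timestring!r}")
    number, unit = float(match.group(1)), match.group(2)
    # Counts of epochs, batches, samples, and tokens are reported as integers
    # when they are whole numbers; durations stay as floats.
    if unit != "dur" and number.is_integer():
        return int(number), unit
    return number, unit


epochs = parse_timestring("5ep")
tokens = parse_timestring("3e4tok")
half = parse_timestring("0.5dur")
```

Strings that lack a recognized suffix, such as `"5xyz"`, raise a `ValueError` in this sketch.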

classmethod from_token(token)[source]#

Create a Time with units of TimeUnit.TOKEN. Equivalent to Time(token, TimeUnit.TOKEN).

Parameters

token (int) – Number of tokens.

Returns

Time – Time instance, in tokens.

property unit#

The unit of the time.

property value#

The value of the time, as a number.

class composer.core.time.TimeUnit(value)[source]#

Bases: composer.utils.string_enum.StringEnum

Units of time for the training process.

EPOCH#

Epochs.

Type

str

BATCH#

Batches (i.e., the number of optimization steps).

Type

str

SAMPLE#

Samples.

Type

str

TOKEN#

Tokens. Applicable for natural language processing (NLP) models.

Type

str

DURATION#

Fraction of the training process complete, on [0.0, 1.0).

Type

str
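
Because units can be given either as enum members or as strings, a small normalization helper is a common pattern. The sketch below is an assumption-laden illustration: `ToyTimeUnit` and `as_unit` are hypothetical names, and the string values for BATCH and SAMPLE are guesses (the examples on this page only show `"ep"`, `"tok"`, and `"dur"`).

```python
from enum import Enum
from typing import Union


class ToyTimeUnit(Enum):
    """Hypothetical string-valued enum mirroring the units listed above."""

    EPOCH = "ep"
    BATCH = "ba"      # assumed value
    SAMPLE = "sp"     # assumed value
    TOKEN = "tok"
    DURATION = "dur"


def as_unit(unit: Union[str, ToyTimeUnit]) -> ToyTimeUnit:
    """Accept either an enum member or its string value (case-insensitive)."""
    if isinstance(unit, ToyTimeUnit):
        return unit
    # Enum lookup by value raises ValueError for unknown strings.
    return ToyTimeUnit(unit.lower())


token_unit = as_unit("tok")
epoch_unit = as_unit(ToyTimeUnit.EPOCH)
```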

class composer.core.time.Timer[source]#

Bases: composer.core.serializable.Serializable

Timer tracks the current training progress, in terms of epochs, batches, samples, and tokens.

property batch#

The current batch.

property batch_in_epoch#

The number of batches seen in the current epoch (resets to 0 at the beginning of every epoch).

property epoch#

The current epoch.

get(unit)[source]#

Returns the current time in the specified unit.

Parameters

unit (str or TimeUnit) – The desired unit.

Returns

Time – The current time, in the specified unit.

get_timestamp()[source]#

Returns a snapshot of the current time.

Unlike the Timer, the values in a Timestamp are a snapshot and are NOT incremented as training progresses.

Returns

Timestamp – A snapshot of the current training time.

on_batch_complete(samples=0, tokens=0)[source]#

Called by the trainer at the end of every optimization batch.

Note

For accurate time tracking, the trainer is responsible for accumulating the total number of samples and/or tokens trained across all ranks before invoking this function.

Parameters

  • samples (int or Time, optional) – The number of samples trained in the batch. Defaults to 0.

  • tokens (int or Time, optional) – The number of tokens trained in the batch. Defaults to 0.

on_epoch_complete()[source]#

Called by the trainer at the end of an epoch.

property sample#

The current sample.

property sample_in_epoch#

The number of samples seen in the current epoch (resets to 0 at the beginning of every epoch).

property token#

The current token.

property token_in_epoch#

The number of tokens seen in the current epoch (resets to 0 at the beginning of every epoch).
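
The relationship between the running totals and the per-epoch counters can be sketched as follows. `SketchTimer` is a hypothetical, simplified class written for this example, not the real Timer. It assumes plain integer counters and assumes the per-epoch counters are reset when an epoch completes.

```python
class SketchTimer:
    """Hypothetical, simplified timer; not the real Timer implementation."""

    def __init__(self) -> None:
        self.epoch = 0
        self.batch = 0
        self.batch_in_epoch = 0
        self.sample = 0
        self.sample_in_epoch = 0
        self.token = 0
        self.token_in_epoch = 0

    def on_batch_complete(self, samples: int = 0, tokens: int = 0) -> None:
        # Totals and per-epoch counters advance together.
        self.batch += 1
        self.batch_in_epoch += 1
        self.sample += samples
        self.sample_in_epoch += samples
        self.token += tokens
        self.token_in_epoch += tokens

    def on_epoch_complete(self) -> None:
        # Only the per-epoch counters reset; the totals keep growing.
        self.epoch += 1
        self.batch_in_epoch = 0
        self.sample_in_epoch = 0
        self.token_in_epoch = 0


timer = SketchTimer()
per_rank_samples = [4, 4]  # pretend two ranks each processed 4 samples
for _ in range(2):         # two full epochs of three batches each
    for _ in range(3):
        # Sum across ranks before updating, as the note above requires.
        timer.on_batch_complete(samples=sum(per_rank_samples))
    timer.on_epoch_complete()
timer.on_batch_complete(samples=sum(per_rank_samples))  # first batch of epoch 3
```

After this loop the totals reflect all seven batches, while the per-epoch counters reflect only the single batch of the unfinished epoch.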

class composer.core.time.Timestamp(epoch, batch, batch_in_epoch, sample, sample_in_epoch, token, token_in_epoch)[source]#

Bases: tuple

Timestamp represents a snapshot of Timer.

It is returned from a call to Timer.get_timestamp().

Unlike the Timer, the values in a Timestamp are a snapshot and are NOT incremented as training progresses.

Note

Timestamp should not be instantiated directly; instead use Timer.get_timestamp().
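
The snapshot behaviour can be illustrated with a plain named tuple. `ToyTimestamp` and the `counters` dictionary are hypothetical stand-ins for this example; the field names follow the Timestamp signature above.

```python
from typing import NamedTuple


class ToyTimestamp(NamedTuple):
    """Hypothetical immutable snapshot with the same fields as Timestamp."""

    epoch: int
    batch: int
    batch_in_epoch: int
    sample: int
    sample_in_epoch: int
    token: int
    token_in_epoch: int


# A mutable dictionary stands in for the live counters of a timer.
counters = {
    "epoch": 0, "batch": 3, "batch_in_epoch": 3,
    "sample": 24, "sample_in_epoch": 24,
    "token": 0, "token_in_epoch": 0,
}

snapshot = ToyTimestamp(**counters)  # copy the current values
counters["batch"] += 1               # training continues; the snapshot does not move
```

Because a named tuple is immutable, the snapshot keeps the values it was created with even as the live counters advance.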