composer.algorithms.functional.apply_seq_length_warmup

composer.algorithms.functional.apply_seq_length_warmup(batch: Dict[str, Tensor], curr_seq_len: int, truncate: bool) -> Union[Tuple[Union[Tensor, Tuple[Tensor, ...], List[Tensor]], Union[Tensor, Tuple[Tensor, ...], List[Tensor]]], List[Tensor], Dict[str, Tensor], Tensor]

Progressively increases the sequence length during training.

Changes the sequence length of all tensors in the provided dictionary to curr_seq_len by either truncating the tensors (truncate=True) or reshaping them to create new examples from the extra tokens (truncate=False).

The schedule for curr_seq_len over training time should be managed outside of this function.
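
For example, a minimal sketch of an externally managed schedule (a hypothetical linear ramp; the step counts and sequence lengths below are illustrative assumptions, not values prescribed by Composer):

    def linear_seq_len_schedule(step: int, warmup_steps: int = 1000,
                                min_seq_len: int = 8, max_seq_len: int = 1024) -> int:
        # Linearly ramp the sequence length from min_seq_len to max_seq_len
        # over warmup_steps, then hold it at max_seq_len.
        if step >= warmup_steps:
            return max_seq_len
        frac = step / warmup_steps
        return int(min_seq_len + frac * (max_seq_len - min_seq_len))

The value produced by such a schedule at each step would then be passed to this function as curr_seq_len.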

Parameters
  • batch (Dict[str, Tensor]) – The input batch to the model; must be a dictionary mapping strings to tensors.

  • curr_seq_len (int) – The desired sequence length to apply.

  • truncate (bool) – If True, truncate sequences to the desired length; if False, reshape tensors to create new examples out of the extra tokens.

Returns

batch – A mapping of input tensors to the model, where all tensors have curr_seq_len in the second dimension.
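
A minimal usage sketch, assuming a toy batch whose keys ("input_ids", "attention_mask") and shapes are illustrative rather than required by the function:

    import torch
    from composer.algorithms.functional import apply_seq_length_warmup

    # Toy batch of 8 examples with sequence length 128 (keys and shapes are illustrative).
    batch = {
        "input_ids": torch.randint(0, 30522, (8, 128)),
        "attention_mask": torch.ones(8, 128, dtype=torch.long),
    }

    # Truncate every tensor in the batch down to the current sequence length.
    short_batch = apply_seq_length_warmup(batch, curr_seq_len=32, truncate=True)

    # Each tensor in short_batch now has 32 in its second dimension, e.g. (8, 32).

With truncate=False, the extra tokens would instead be reshaped into additional examples, per the description above.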