utils#
Module utils.

Functions

- register_alibi: Adds ALiBi's linear attention biases as a buffer to the module.
- zero_and_freeze_expand_position_embeddings: Replaces weights with zero tensor and prevents them from being learned further.

Classes

- PolicyRegistry: A registry mapping for ALiBi surgery.
- attrgetter: attrgetter(attr, ...) --> attrgetter object

Attributes

- AlibiReplacementFunction
- Callable
- Dict
- Optional
- Type
- log
- policy_registry
- class composer.algorithms.alibi.attention_surgery_functions.utils.PolicyRegistry[source]#
Bases: Dict[Type[torch.nn.modules.module.Module], Callable[[torch.nn.modules.module.Module, int, int], Optional[torch.nn.modules.module.Module]]]

A registry mapping for ALiBi surgery.
- register(*modules)[source]#
This decorator registers mappings from torch module types to their ALiBi surgery functions.

To accommodate the specifics of composer's module surgery, our ALiBi implementation uses a registry to create a Mapping[torch.nn.Module, AlibiReplacementFunction], where AlibiReplacementFunction is any function that has a ReplacementFunction signature but with an additional max_sequence_length argument.

Implementation files (e.g., _gpt2.py) populate policy_registry (an instance of this class) by defining AlibiReplacementFunction functions and decorating them with policy_registry.register() (this method). One or more Type[torch.nn.Module] source classes must be supplied as inputs to the decorator, which tells policy_registry to map those classes to the decorated function.

Example:
```python
import torch
from composer.algorithms.alibi.attention_surgery_functions.utils import policy_registry
from transformers.models.gpt2.modeling_gpt2 import GPT2Attention

@policy_registry.register(GPT2Attention)
def convert_gpt2_attention(module: torch.nn.Module, index: int, max_sequence_length: int):
    # Do surgery (change ``module`` or generate a new ``module`` instance to return)
    # Note that this function should depend on ``max_sequence_length``

    # YOUR CODE HERE

    return module
```
In the above example, convert_gpt2_attention (an AlibiReplacementFunction) is decorated with @policy_registry.register(GPT2Attention). Using the decorator this way instructs the ALiBi algorithm to apply surgery to any instance of GPT2Attention within the model using convert_gpt2_attention (the decorated function).

Note that convert_gpt2_attention follows the specific signature of an AlibiReplacementFunction. policy_registry.register() will raise an exception if it is used to decorate a function that does not follow this signature. The requirements are:

* The function takes 3 input arguments
* Argument 1 has type torch.nn.Module
* Argument 2 has type int
* Argument 3 is named max_sequence_length and has type int

To better understand these requirements, it may be helpful to review composer's module surgery (composer/utils/module_surgery.py) and the way ALiBi's implementation uses policy_registry in composer.algorithms.alibi.apply_alibi().
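Because PolicyRegistry subclasses Dict, the effect of register() can be inspected directly. The sketch below uses a placeholder ToyAttention class and convert_toy_attention function invented purely for illustration; the real implementation files register functions for actual attention classes such as GPT2Attention.

```python
import torch
from composer.algorithms.alibi.attention_surgery_functions.utils import policy_registry


class ToyAttention(torch.nn.Module):
    """Placeholder attention module used only for this illustration."""


@policy_registry.register(ToyAttention)
def convert_toy_attention(module: torch.nn.Module, index: int, max_sequence_length: int):
    # A real surgery function would modify ``module`` (or build a replacement)
    # using ``max_sequence_length``; returning it unchanged is enough to show
    # the registry mechanics.
    return module


# ``register`` added a dictionary entry mapping the module class to the
# decorated surgery function.
assert ToyAttention in policy_registry
```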
- composer.algorithms.alibi.attention_surgery_functions.utils.register_alibi(module, n_heads, max_token_length, causal)[source]#
Adds ALiBi's linear attention biases as a buffer to the module.
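A minimal usage sketch: torch.nn.MultiheadAttention and the sizes below are illustrative stand-ins for whatever attention module is actually being modified, not part of composer's API.

```python
import torch
from composer.algorithms.alibi.attention_surgery_functions.utils import register_alibi

# A standard attention module stands in for the layer being modified.
attn = torch.nn.MultiheadAttention(embed_dim=768, num_heads=12)

# Attach ALiBi's linear attention biases to ``attn`` as a buffer. ``causal=True``
# is the typical choice for decoder-style (GPT-like) attention.
register_alibi(module=attn, n_heads=attn.num_heads, max_token_length=1024, causal=True)
```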
- composer.algorithms.alibi.attention_surgery_functions.utils.zero_and_freeze_expand_position_embeddings(module, max_sequence_length, position_embedding_attribute)[source]#
Replaces weights with zero tensor and prevents them from being learned further.
This is intended to be used specifically for "removing" positional embeddings.
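A usage sketch for removing GPT-2's learned positional embeddings: the Hugging Face GPT2Model, its wpe attribute name, and the sequence length are assumptions made for illustration.

```python
from transformers import GPT2Model

from composer.algorithms.alibi.attention_surgery_functions.utils import (
    zero_and_freeze_expand_position_embeddings,
)

model = GPT2Model.from_pretrained('gpt2')
# Zero out and freeze the learned positional embeddings, since ALiBi supplies
# position information through attention biases instead.
zero_and_freeze_expand_position_embeddings(
    model,
    max_sequence_length=1024,
    position_embedding_attribute='wpe',  # assumed attribute name on GPT2Model
)
```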