utils#
Module utils.

Functions

- register_alibi: Adds ALiBi's linear attention biases as a buffer to the module.
- zero_and_freeze_expand_position_embeddings: Replaces weights with zero tensor and prevents them from being learned further.

Classes

- PolicyRegistry: A registry mapping for ALiBi surgery.
- attrgetter: attrgetter(attr, ...) --> attrgetter object

Attributes

- AlibiReplacementFunction
- Callable
- Dict
- Optional
- Type
- log
- policy_registry
- class composer.algorithms.alibi.attention_surgery_functions.utils.PolicyRegistry[source]#
Bases: Dict[Type[torch.nn.modules.module.Module], Callable[[torch.nn.modules.module.Module, int, int], Optional[torch.nn.modules.module.Module]]]

A registry mapping for ALiBi surgery.
- register(*modules)[source]#
This decorator registers mappings from torch module types to their ALiBi surgery functions.

To accommodate the specifics of composer's module surgery, our ALiBi implementation uses a registry to create a Mapping[torch.nn.Module, AlibiReplacementFunction], where AlibiReplacementFunction is any function that has a ReplacementFunction signature but with an additional max_sequence_length argument.

Implementation files (e.g., _gpt2.py) populate policy_registry (an instance of this class) by defining AlibiReplacementFunction functions and decorating them with policy_registry.register() (this method). One or more Type[torch.nn.Module] source classes must be supplied as inputs to the decorator, which tells policy_registry to map those classes to the decorated function.

Example:
```python
import torch
from composer.algorithms.alibi.attention_surgery_functions.utils import policy_registry
from transformers.models.gpt2.modeling_gpt2 import GPT2Attention

@policy_registry.register(GPT2Attention)
def convert_gpt2_attention(module: torch.nn.Module, index: int, max_sequence_length: int):
    # Do surgery (change ``module`` or generate a new ``module`` instance to return)
    # Note that this function should depend on ``max_sequence_length``

    # YOUR CODE HERE

    return module
```
In the above example, convert_gpt2_attention (an AlibiReplacementFunction) is decorated with @policy_registry.register(GPT2Attention). Using the decorator this way instructs the ALiBi algorithm to apply surgery to any instance of GPT2Attention within the model using convert_gpt2_attention (the decorated function).

Note that convert_gpt2_attention follows the specific signature of an AlibiReplacementFunction. policy_registry.register() will raise an exception if it is used to decorate a function that does not follow this signature. The requirements are:

* The function takes 3 input arguments
* Argument 1 has type torch.nn.Module
* Argument 2 has type int
* Argument 3 is named max_sequence_length and has type int

To better understand these requirements, it may be helpful to review composer's module surgery (composer/utils/module_surgery.py) and the way ALiBi's implementation uses policy_registry in composer.algorithms.alibi.apply_alibi().
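Because PolicyRegistry subclasses Dict, the effect of register() can be inspected directly. The sketch below uses a placeholder ToyAttention class and convert_toy_attention function invented purely for illustration; the real implementation files register functions for actual attention classes such as GPT2Attention.

```python
import torch
from composer.algorithms.alibi.attention_surgery_functions.utils import policy_registry


class ToyAttention(torch.nn.Module):
    """Placeholder attention module used only for this illustration."""


@policy_registry.register(ToyAttention)
def convert_toy_attention(module: torch.nn.Module, index: int, max_sequence_length: int):
    # A real surgery function would modify ``module`` (or build a replacement)
    # using ``max_sequence_length``; returning it unchanged is enough to show
    # the registry mechanics.
    return module


# ``register`` added a dictionary entry mapping the module class to the
# decorated surgery function.
assert ToyAttention in policy_registry
```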
- composer.algorithms.alibi.attention_surgery_functions.utils.register_alibi(module, n_heads, max_token_length, causal)[source]#
Adds ALiBi's linear attention biases as a buffer to the module.
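A minimal usage sketch: torch.nn.MultiheadAttention and the sizes below are illustrative stand-ins for whatever attention module is actually being modified, not part of composer's API.

```python
import torch
from composer.algorithms.alibi.attention_surgery_functions.utils import register_alibi

# A standard attention module stands in for the layer being modified.
attn = torch.nn.MultiheadAttention(embed_dim=768, num_heads=12)

# Attach ALiBi's linear attention biases to ``attn`` as a buffer. ``causal=True``
# is the typical choice for decoder-style (GPT-like) attention.
register_alibi(module=attn, n_heads=attn.num_heads, max_token_length=1024, causal=True)
```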
- composer.algorithms.alibi.attention_surgery_functions.utils.zero_and_freeze_expand_position_embeddings(module, max_sequence_length, position_embedding_attribute)[source]#
Replaces weights with zero tensor and prevents them from being learned further.
This is intended to be used specifically for "removing" positional embeddings.
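A usage sketch for removing GPT-2's learned positional embeddings: the Hugging Face GPT2Model, its wpe attribute name, and the sequence length are assumptions made for illustration.

```python
from transformers import GPT2Model

from composer.algorithms.alibi.attention_surgery_functions.utils import (
    zero_and_freeze_expand_position_embeddings,
)

model = GPT2Model.from_pretrained('gpt2')
# Zero out and freeze the learned positional embeddings, since ALiBi supplies
# position information through attention biases instead.
zero_and_freeze_expand_position_embeddings(
    model,
    max_sequence_length=1024,
    position_embedding_attribute='wpe',  # assumed attribute name on GPT2Model
)
```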