composer.algorithms.gated_linear_units.gated_linear_unit_layers#
Classes

BERTGatedFFOutput – Defines a single feed-forward block that uses Gated Linear Units.

Attributes

Callable
- class composer.algorithms.gated_linear_units.gated_linear_unit_layers.BERTGatedFFOutput(d_embed, d_ff, dropout_rate, act_fn, layernorm_eps, gated_layer_bias=False, non_gated_layer_bias=False)[source]#
Bases:
torch.nn.modules.module.Module
Defines a single feed-forward block that uses Gated Linear Units.
- Parameters
d_embed (int) – The input dimension for the feed-forward network.
d_ff (int) – The hidden dimension for the feed-forward network.
dropout_rate (float) – The dropout rate to use between the two projection matrices in the feed-forward block.
act_fn (Callable[[Tensor], Tensor]) – The activation function to use in the feed-forward network.
layernorm_eps (float) – The epsilon term to use in the LayerNorm operator. Useful for when the variance is small.
gated_layer_bias (bool) – Whether to use a bias term in the gated projection matrix.
non_gated_layer_bias (bool) – Whether to use a bias term in the non-gated projection matrix.
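To illustrate how these parameters fit together, here is a minimal sketch of a GLU feed-forward block with the same constructor arguments. This is an illustrative re-implementation, not the library's own code; the class name `GatedFFSketch`, the layer attribute names, and the `forward(hidden_states, residual)` signature are assumptions for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFFSketch(nn.Module):
    """Illustrative GLU feed-forward block (not the Composer class itself)."""

    def __init__(self, d_embed, d_ff, dropout_rate, act_fn, layernorm_eps,
                 gated_layer_bias=False, non_gated_layer_bias=False):
        super().__init__()
        # Two parallel projections from d_embed to d_ff: one gated, one not.
        self.gated_layer = nn.Linear(d_embed, d_ff, bias=gated_layer_bias)
        self.non_gated_layer = nn.Linear(d_embed, d_ff, bias=non_gated_layer_bias)
        # Output projection back down to the embedding dimension.
        self.wo = nn.Linear(d_ff, d_embed)
        self.dropout = nn.Dropout(dropout_rate)
        self.act = act_fn
        self.layernorm = nn.LayerNorm(d_embed, eps=layernorm_eps)

    def forward(self, hidden_states, residual):
        # GLU: elementwise product of the activated gate branch
        # and the plain linear branch.
        hidden = self.act(self.gated_layer(hidden_states)) * self.non_gated_layer(hidden_states)
        hidden = self.wo(self.dropout(hidden))
        # Residual add followed by LayerNorm, as in a standard BERT output block.
        return self.layernorm(hidden + residual)

# BERT-base-like dimensions for demonstration.
block = GatedFFSketch(d_embed=768, d_ff=3072, dropout_rate=0.1,
                      act_fn=F.gelu, layernorm_eps=1e-12)
x = torch.randn(2, 16, 768)   # (batch, sequence, d_embed)
out = block(x, x)              # output keeps the input shape
```

Note that the output shape matches the input shape `(batch, sequence, d_embed)`, so the block can drop into a transformer layer in place of a standard feed-forward block.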