gated_linear_units
Modules
Replaces the Linear layers in the feed-forward network with Gated Linear Units.
This typically improves convergence at the cost of a slight drop in throughput. Omitting the bias terms in the GLU is highly recommended.
See the Method Card for more details.
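To make the substitution concrete, here is a minimal, framework-free sketch of a bias-free feed-forward block next to a gated counterpart. It is not this module's implementation: the helper names (`matvec`, `standard_ffn`, `glu_ffn`, `random_matrix`), the choice of GELU as the activation, and the gating form `act(W_in x) * (W_gate x)` are assumptions made for illustration only.

```python
import math
import random


def matvec(weight, x):
    """Multiply a weight matrix (stored as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def gelu(v):
    """Exact GELU, computed with the error function."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def standard_ffn(x, w_in, w_out, act=gelu):
    """Conventional block: w_out @ act(w_in @ x), with no bias terms."""
    hidden = [act(h) for h in matvec(w_in, x)]
    return matvec(w_out, hidden)


def glu_ffn(x, w_in, w_gate, w_out, act=gelu):
    """Gated block (assumed form): w_out @ (act(w_in @ x) * (w_gate @ x)), no bias."""
    activated = [act(h) for h in matvec(w_in, x)]
    gate = matvec(w_gate, x)
    # Elementwise product: the gate scales each activated hidden unit.
    hidden = [a * g for a, g in zip(activated, gate)]
    return matvec(w_out, hidden)


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small Gaussian-initialised matrix."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_model, d_hidden = 4, 8
x = [rng.gauss(0.0, 1.0) for _ in range(d_model)]
w_in = random_matrix(d_hidden, d_model, rng)
w_gate = random_matrix(d_hidden, d_model, rng)
w_out = random_matrix(d_model, d_hidden, rng)

y_plain = standard_ffn(x, w_in, w_out)
y_gated = glu_ffn(x, w_in, w_gate, w_out)
```

In this sketch the gated block carries three weight matrices where the plain block has two. That extra projection is one plausible reason for the throughput cost mentioned above.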
Functions
Replaces the Linear layers in the feed-forward network with Gated Linear Units.
Classes
Replaces all instances of Linear layers in the feed-forward subnetwork with a Gated Linear Unit.