[HN Gopher] DeepNet: Scaling Transformers to 1k Layers
___________________________________________________________________
DeepNet: Scaling Transformers to 1k Layers
Author : homarp
Score : 4 points
Date : 2022-03-02 22:10 UTC (50 minutes ago)
(HTM) web link (arxiv.org)
___________________________________________________________________
(page generated 2022-03-02 23:00 UTC)