[HN Gopher] Reproducing the deep double descent paper
___________________________________________________________________
Reproducing the deep double descent paper
Author : stpn
Score : 8 points
Date : 2025-06-05 18:34 UTC (4 hours ago)
(HTM) web link (stpn.bearblog.dev)
(TXT) w3m dump (stpn.bearblog.dev)
| davidguetta wrote:
  | Is this not because the longer you train, the more neurons 'die'
  | (no longer utilized because the gradient is flat on the dataset),
  | so you effectively get a smaller model as training goes on?
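  
  A minimal sketch of how one could probe this "dying neurons"
  hypothesis: track the fraction of hidden ReLU units that never
  activate on a held-out batch as training proceeds. The toy model,
  synthetic data, and threshold below are illustrative assumptions,
  not the setup from the article or the paper.
  
      import torch
      import torch.nn as nn
  
      torch.manual_seed(0)
  
      # Toy MLP standing in for the larger models in the double descent setup.
      model = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
      opt = torch.optim.SGD(model.parameters(), lr=0.1)
      loss_fn = nn.CrossEntropyLoss()
  
      # Synthetic data; in practice this would be the real training set.
      x = torch.randn(2048, 32)
      y = torch.randint(0, 10, (2048,))
      probe = torch.randn(512, 32)  # held-out batch used to probe activations
  
      def dead_fraction(model, probe, eps=1e-6):
          """Fraction of hidden ReLU units that stay (near) zero on the probe batch."""
          with torch.no_grad():
              pre = model[0](probe)  # pre-activations of the hidden layer
              act = torch.relu(pre)
              return (act.max(dim=0).values < eps).float().mean().item()
  
      for epoch in range(1, 201):
          opt.zero_grad()
          loss = loss_fn(model(x), y)
          loss.backward()
          opt.step()
          if epoch % 50 == 0:
              print(f"epoch {epoch:4d}  loss {loss.item():.3f}  "
                    f"dead units {dead_fraction(model, probe):.1%}")
  
  If the dead-unit fraction grew steadily with training epochs, that
  would be consistent with the "effectively smaller model" reading;
  the epoch-wise double descent results in the paper are usually
  explained instead in terms of effective model complexity relative
  to the interpolation threshold.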
___________________________________________________________________
(page generated 2025-06-05 23:01 UTC)