[HN Gopher] Expanding Transformer size without losing function o...
       ___________________________________________________________________
        
       Expanding Transformer size without losing function or starting from
       scratch
        
       Author : og_kalu
       Score  : 24 points
       Date   : 2023-08-18 17:14 UTC (5 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | valine wrote:
       | I'm surprised we weren't doing this already. I'd like to see what
        | happens if you train a small language model on preschool-level
        | reading material, and ramp up both the model size and the
        | complexity of the training data as you go. My hope would be that
        | you'd need less data to train a model this way than with our
        | current approach of throwing the entire internet at the model.
        
         | two_in_one wrote:
         | > I'm surprised we weren't doing this already.
         | 
          | Because approaches like this haven't worked very well in real
          | life. The article doesn't offer any experimental proof that it
          | works, saves training time, reduces size, or anything else.
          | 
          | And it doesn't mention another promising approach: mixture of
          | experts. There are many ways of implementing it; I mean the
          | general idea of specialized fragments of the NN which are
          | selectively called. They don't have to be trained all at once.
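  [Editor's note: the mixture-of-experts idea mentioned above can be
  sketched in a few lines. A gate scores the experts and only the winning
  expert subnetwork is evaluated for a given input. All names here are
  illustrative; real MoE layers use learned gates, top-k routing, and load
  balancing.]

```python
# Sketch: top-1 mixture-of-experts routing with random linear experts.
# Only the selected expert's weights are used for each input.
import numpy as np

class TinyMoE:
    def __init__(self, n_experts, d_in, d_out, seed=0):
        rng = np.random.default_rng(seed)
        # Each "expert" is a small linear map; the gate scores experts.
        self.experts = [rng.normal(size=(d_out, d_in))
                        for _ in range(n_experts)]
        self.gate = rng.normal(size=(n_experts, d_in))

    def __call__(self, x):
        scores = self.gate @ x        # one routing score per expert
        k = int(np.argmax(scores))    # top-1 routing: pick best expert
        return self.experts[k] @ x, k # only expert k is evaluated

moe = TinyMoE(n_experts=4, d_in=8, d_out=2)
x = np.ones(8)
y, chosen = moe(x)
```

  Because experts are independent modules, they can in principle be added
  or trained separately, which is the point the comment makes.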
        
       | galaxytachyon wrote:
        | Hilarious. Eventually we will create a truly sentient AGI that is
        | highly efficient and can do everything a normal human can, but it
        | turns out it needs 20 years of training, demands too much
        | entertainment, and often goes mad when we force it to do
        | repetitive work for too long, i.e. a normal human.
        | 
        | It would be so funny if, by the end of all the AI research, we
        | found out the human brain is already the best you can ever get.
        
         | bilsbie wrote:
         | Is that kind of the Star Trek universe?
        
         | Ifkaluva wrote:
         | I would be delighted by this. Sounds like something Douglas
         | Adams would dream up.
        
       ___________________________________________________________________
       (page generated 2023-08-18 23:01 UTC)