[HN Gopher] A Revolution in How Robots Learn
       ___________________________________________________________________
        
       A Revolution in How Robots Learn
        
       Author : jsomers
       Score  : 32 points
       Date   : 2024-11-26 12:08 UTC (10 hours ago)
        
 (HTM) web link (www.newyorker.com)
 (TXT) w3m dump (www.newyorker.com)
        
       | x11antiek wrote:
       | https://archive.is/fsuxe
        
       | Animats wrote:
       | A research result reported before, but, as usual, the New Yorker
       | has better writers.
       | 
       | Is there something which shows what the tokens they use look
       | like?
        
       | codr7 wrote:
       | Oh my, that has to be one of the worst jobs ever invented.
        
       | m_ke wrote:
        | I did a review of the state of the art in robotics recently,
        | in prep for some job interviews, and the stack is the same
        | as in all other ML problems these days: take a large
        | pretrained multimodal model and do supervised fine-tuning on
        | your domain data.
       | 
        | In this case it's "VLA", as in Vision-Language-Action
        | models, where a multimodal decoder predicts action tokens,
        | and "behavior cloning" is a fancy made-up term for
        | supervised learning, because the RL people can't bring
        | themselves to admit that supervised learning works way
        | better than reinforcement learning in the real world.
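        | 
        | As a rough sketch of what that looks like (behavior cloning
        | is just a supervised loss on demonstration actions; all
        | names here are hypothetical, not any particular lab's code):
        | 
        |     import torch
        |     import torch.nn.functional as F
        | 
        |     # Pretrained vision-language model with an action-token
        |     # head; placeholder loader.
        |     model = load_pretrained_vla()
        |     opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
        | 
        |     # Teleoperation demos: camera frames, an instruction,
        |     # and the operator's actions discretized into tokens.
        |     for frames, instruction, actions in teleop_demos:
        |         logits = model(frames, instruction)  # (T, vocab)
        |         loss = F.cross_entropy(logits, actions)
        |         opt.zero_grad()
        |         loss.backward()
        |         opt.step()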
       | 
        | Proper imitation learning, where a robot learns from third-
        | person video of humans doing things, does not work yet, but
        | some people in the field like to pretend that teleoperation
        | plus "behavior cloning" is a form of imitation learning.
        
       | josefritzishere wrote:
       | There's a big asterisk on the word "learn" in that headline.
        
       | ratedgene wrote:
        | Hey, I wonder if we can use LLMs to learn learning patterns.
        | I guess the bottleneck would be the curse of dimensionality
        | when it comes to real-world problems, but I think maybe
        | (correct me if I'm wrong) geographic/domain-specific
        | attention networks could be used.
       | 
        | Maybe it's like (sketched in code below):
        | 
        | 1. Intention, context
        | 2. Attention scanning for components
        | 3. Attention network discovery
        | 4. Rescan for missing components
        | 5. If no relevant context exists or is found
        | 6. Learned parameters are initially greedy
        | 7. Storage of parameters gets reduced over time by other
        | contributors
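        | 
        | Very loosely, as a sketch of that loop (every function here
        | is hypothetical, just naming the steps above):
        | 
        |     def learn(intention, context, store):
        |         parts = attention_scan(intention, context)  # 2
        |         net = discover_network(parts)               # 3
        |         parts += rescan(net, parts)                 # 4
        |         if not relevant_context(context, store):    # 5
        |             params = greedy_init(net)               # 6
        |             store.add(params)  # pruned over time by
        |                                # other contributors (7)
        |         return net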
       | 
       | I guess this relies on there being the tough parts: induction,
       | deduction, abductive reasoning.
       | 
        | Can we fake reasoning to test hypotheses that alter the
        | weights of whatever model we use for reasoning?
        
         | ratedgene wrote:
          | Maybe I'm just overcomplicating unsupervised reinforcement
          | learning and adding central authorities for domain-specific
          | models.
        
       ___________________________________________________________________
       (page generated 2024-11-26 23:00 UTC)