[HN Gopher] A Revolution in How Robots Learn
___________________________________________________________________
A Revolution in How Robots Learn
Author : jsomers
Score : 32 points
Date : 2024-11-26 12:08 UTC (10 hours ago)
(HTM) web link (www.newyorker.com)
(TXT) w3m dump (www.newyorker.com)
| x11antiek wrote:
| https://archive.is/fsuxe
| Animats wrote:
  | A research result that's been reported before, but, as usual, the
  | New Yorker has better writers.
|
  | Is there something that shows what the tokens they use look
  | like?
| codr7 wrote:
| Oh my, that has to be one of the worst jobs ever invented.
| m_ke wrote:
  | I did a review of the state of the art in robotics recently, in
  | prep for some job interviews, and the stack is the same as for
  | every other ML problem these days: take a large pretrained
  | multimodal model and do supervised fine-tuning on your domain
  | data.
|
| In this case it's "VLA" as in Vision Language Action models,
| where a multimodal decoder predicts action tokens and "behavior
| cloning" is a fancy made up term for supervised learning, because
| all of the RL people can't get themselves to admit that
| supervised learning works way better than reinforcement learning
| in the real world.
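  |
  | As a minimal sketch of that setup, i.e. behavior cloning as plain
  | supervised fine-tuning of a multimodal decoder on teleoperation
  | demos (the model and data below are toy stand-ins, not any real
  | VLA codebase):
  |
  |     import torch
  |     import torch.nn as nn
  |     import torch.nn.functional as F
  |
  |     # Toy stand-in for a pretrained vision-language decoder; in
  |     # practice you'd load a real multimodal checkpoint and add an
  |     # action head before fine-tuning.
  |     class TinyVLA(nn.Module):
  |         def __init__(self, vocab=1024, n_act=7, dim=256):
  |             super().__init__()
  |             self.vision = nn.Linear(3 * 64 * 64, dim)  # image enc
  |             self.text = nn.Embedding(vocab, dim)       # text enc
  |             self.decoder = nn.GRU(dim, dim, batch_first=True)
  |             self.head = nn.Linear(dim, 256)  # 256 bins per action
  |             self.n_act = n_act
  |
  |         def forward(self, image, instruction):
  |             img = self.vision(image.flatten(1)).unsqueeze(1)
  |             txt = self.text(instruction)
  |             out, _ = self.decoder(torch.cat([img, txt], dim=1))
  |             # action-token logits from the last decoder positions
  |             return self.head(out[:, -self.n_act:, :])
  |
  |     model = TinyVLA()
  |     opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
  |
  |     # "Behavior cloning" = supervised learning on teleop demos:
  |     # (frame, instruction) -> discretized action tokens, trained
  |     # with plain cross-entropy.
  |     for step in range(100):
  |         image = torch.randn(8, 3, 64, 64)              # camera
  |         instruction = torch.randint(0, 1024, (8, 16))  # command
  |         actions = torch.randint(0, 256, (8, 7))        # targets
  |         logits = model(image, instruction)             # (8,7,256)
  |         loss = F.cross_entropy(logits.reshape(-1, 256),
  |                                actions.reshape(-1))
  |         opt.zero_grad()
  |         loss.backward()
  |         opt.step()
  |
  | The point of the sketch is just the shape of the objective:
  | supervised prediction of action tokens from observations and
  | instructions, nothing RL-specific.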
|
  | Proper imitation learning, where a robot learns from a
  | third-person view of humans doing stuff, does not work yet, but
  | some people in the field like to pretend that teleoperation plus
  | "behavior cloning" is a form of imitation learning.
| josefritzishere wrote:
| There's a big asterisk on the word "learn" in that headline.
| ratedgene wrote:
  | Hey, I wonder if we can use LLMs to learn learning patterns. I
  | guess the bottleneck would be the curse of dimensionality when it
  | comes to real-world problems, but I think maybe (correct me if
  | I'm wrong) geographic/domain-specific attention networks could be
  | used.
|
| Maybe it's like:
|
  | 1. Intention, context
  | 2. Attention scanning for components
  | 3. Attention network discovery
  | 4. Rescan for missing components
  | 5. If no relevant context exists or found
  | 6. Learned parameters are initially greedy
  | 7. Storage of parameters gets reduced over time by other
  |    contributors
|
  | I guess this relies on the tough parts being there: induction,
  | deduction, and abductive reasoning.
|
  | Can we fake reasoning to test hypotheses that alter the weights
  | of whatever model we use for reasoning?
| ratedgene wrote:
  | Maybe I'm just overcomplicating unsupervised reinforcement
  | learning and adding central authorities for domain-specific
  | models.
___________________________________________________________________
(page generated 2024-11-26 23:00 UTC)