[HN Gopher] Mastering diverse domains through world models
       ___________________________________________________________________
        
       Mastering diverse domains through world models
        
       Author : braymundo
       Score  : 58 points
       Date   : 2023-01-11 14:25 UTC (8 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | jmeister wrote:
        | Man I was hoping this wasn't from DeepMind.. come on, one
        | cutting-edge paper, one!
        
       | schizo89 wrote:
        | So this is an interesting approach: unlike a model-free
        | algorithm, it learns a simulator (a state -> state transition
        | model).
       | 
        | Danijar has DreamerV2 and a robot-dog application of Dreamer on
        | his GitHub, for those interested in the implementation.
        
       | ipnon wrote:
       | What's the fastest way to grok this paper with an undergraduate-
       | level understanding of deep learning? What is the go-to resource
       | for reinforcement learning today? The field is quickly changing.
        
         | machiaweliczny wrote:
          | I suggest just reading it carefully, noting what you don't
          | understand, and working from there. You can use ChatGPT[0] and
          | Explain Paper[1] to assist you. Or ask on some relevant
          | Discord channels.
         | 
         | First topic: AutoEncoder
         | 
         | An autoencoder is a type of neural network architecture that is
         | typically used for dimensionality reduction or feature
         | learning. It is composed of two main parts: an encoder, which
         | maps the input data to a lower-dimensional representation, and
         | a decoder, which maps the lower-dimensional representation back
         | to the original dimension. The goal of the autoencoder is to
         | learn a compressed representation of the input data that
         | captures the most important features, while discarding the less
         | important ones. This is achieved by training the network to
         | reconstruct the input data as accurately as possible, typically
         | using a mean squared error or cross-entropy loss function.
         | 
          | Ok, so an autoencoder is a program that does:
          | 
          |   K = compress(state)
          |   state' = decompress(K)
          | 
          | You train it by minimising the difference between state and
          | state'; the goal is for K to capture useful knowledge about
          | the data.
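          | 
          | To make that concrete, here's a rough PyTorch sketch (not
          | from the paper; all the sizes and names are my own choices):
          | 
          |   import torch
          |   import torch.nn as nn
          | 
          |   # Toy autoencoder: squeeze a 784-dim input down to 32 dims
          |   # and back. Layer sizes are arbitrary.
          |   class AutoEncoder(nn.Module):
          |       def __init__(self, dim_in=784, dim_k=32):
          |           super().__init__()
          |           self.encoder = nn.Sequential(
          |               nn.Linear(dim_in, 128), nn.ReLU(),
          |               nn.Linear(128, dim_k))
          |           self.decoder = nn.Sequential(
          |               nn.Linear(dim_k, 128), nn.ReLU(),
          |               nn.Linear(128, dim_in))
          | 
          |       def forward(self, state):
          |           k = self.encoder(state)   # K = compress(state)
          |           return self.decoder(k)    # state' = decompress(K)
          | 
          |   model = AutoEncoder()
          |   opt = torch.optim.Adam(model.parameters(), lr=1e-3)
          |   state = torch.rand(64, 784)       # stand-in batch of data
          |   loss = nn.functional.mse_loss(model(state), state)
          |   opt.zero_grad(); loss.backward(); opt.step()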
         | 
         | Next to solve: "What is Recurrent State-Space Model in ML?"
         | 
         | ChatGPT: A Recurrent State-Space Model (RSSM) is a type of
         | statistical model that is often used for time series data
         | analysis. It is a combination of a Recurrent Neural Network
         | (RNN) and a State-Space Model (SSM).
         | 
          | An SSM is a mathematical model that describes the evolution of
          | a system over time in terms of a set of state variables. The
          | state variables are typically hidden, meaning that they cannot
          | be directly observed, but their values can be inferred from
          | observations of the system. An SSM is often used to model
          | dynamic systems, such as physical or biological systems.
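          | 
          | (A tiny numeric illustration of that; the matrices and noise
          | scales below are made up:)
          | 
          |   import numpy as np
          | 
          |   # Toy linear-Gaussian SSM: the hidden state x evolves over
          |   # time, but we only ever see the noisy observation y.
          |   A = np.array([[0.9, 0.1], [0.0, 0.8]])  # state transition
          |   C = np.array([[1.0, 0.0]])              # observation map
          | 
          |   x = np.zeros(2)                          # hidden state
          |   for t in range(5):
          |       x = A @ x + np.random.normal(0, 0.1, size=2)
          |       y = C @ x + np.random.normal(0, 0.1, size=1)
          |       print(t, y)                          # all we observe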
         | 
          | An RNN is a type of neural network architecture that is
          | particularly well-suited for processing sequential data, such
          | as time series or sequences of words. The key feature of an
          | RNN is that it contains a set of recurrent connections, which
          | allow information to flow through the network across multiple
          | time steps.
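          | 
          | (Again just a sketch; the weights and sizes are arbitrary:)
          | 
          |   import numpy as np
          | 
          |   # Minimal recurrent step: the same weights are reused at
          |   # every time step, and the hidden state h carries
          |   # information forward across the sequence.
          |   rng = np.random.default_rng(0)
          |   W_x = rng.normal(size=(8, 3))  # input -> hidden
          |   W_h = rng.normal(size=(8, 8))  # hidden -> hidden (recurrent)
          | 
          |   h = np.zeros(8)
          |   for x in rng.normal(size=(5, 3)):   # sequence of 5 inputs
          |       h = np.tanh(W_x @ x + W_h @ h)  # mix input with the past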
         | 
          | An RSSM combines the strengths of the RNN and the SSM. The RNN
          | handles the sequential, temporal dependencies in the data, and
          | the SSM infers the underlying state variables of the system.
          | These state variables act as the internal representation of
          | the data: a compressed form, so the state space can be seen as
          | a low-dimensional encoding of the data.
         | 
         | An RSSM can be used for a wide range of tasks, such as
         | forecasting, prediction, and control of dynamic systems,
         | sequential decision making, and generative modelling of time
         | series data.
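          | 
          | Putting the two together, here's a very rough PyTorch sketch
          | of the RSSM idea (loosely in the spirit of the Dreamer papers,
          | not the exact architecture; every name and size here is my own
          | guess):
          | 
          |   import torch
          |   import torch.nn as nn
          | 
          |   # Deterministic GRU state h (the RNN part) plus a sampled
          |   # Gaussian state z (the SSM part).
          |   class TinyRSSM(nn.Module):
          |       def __init__(self, obs_dim=16, h_dim=32, z_dim=8):
          |           super().__init__()
          |           self.rnn = nn.GRUCell(obs_dim + z_dim, h_dim)
          |           self.to_stats = nn.Linear(h_dim, 2 * z_dim)
          | 
          |       def step(self, obs, h, z):
          |           h = self.rnn(torch.cat([obs, z], -1), h)
          |           mean, log_std = self.to_stats(h).chunk(2, -1)
          |           z = mean + log_std.exp() * torch.randn_like(mean)
          |           return h, z             # (h, z) is the model state
          | 
          |   model = TinyRSSM()
          |   h, z = torch.zeros(1, 32), torch.zeros(1, 8)
          |   for obs in torch.randn(10, 1, 16):  # observation sequence
          |       h, z = model.step(obs, h, z)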
         | 
          | > Can you explain how each of those works in layman's terms
          | and give pseudo-code examples?
         | 
         | ....
         | 
         | [0] https://chat.openai.com/chat
         | 
         | [1] https://www.explainpaper.com/
        
           | phailhaus wrote:
           | ChatGPT is a chatbot, not a resource. I would avoid using it
           | to learn topics because there is no guarantee that anything
           | it says is actually correct.
        
           | schizo89 wrote:
            | AI is huge, but this one, I think, is about Reinforcement
            | Learning. Udacity has a nicely paced course that's possible
            | to complete in 2-3 months without mental overload. It covers
            | the Reinforcement Learning problem definition, Dynamic
            | Programming, and hands-on work with foundational algorithms
            | such as DQN, PPO, SAC and Monte Carlo Tree Search.
            | 
            | Auto-encoders, Transformers and recurrent models don't pop
            | up much in RL outside of advanced topics and cutting-edge
            | research.
        
       ___________________________________________________________________
       (page generated 2023-01-11 23:01 UTC)