[HN Gopher] Mastering diverse domains through world models
___________________________________________________________________
Mastering diverse domains through world models
Author : braymundo
Score : 58 points
Date : 2023-01-11 14:25 UTC (8 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| jmeister wrote:
| Man, I was hoping this wasn't from DeepMind... come on, one
| cutting-edge paper, one!
| schizo89 wrote:
| So this is an interesting approach: unlike a model-free
| algorithm, it tries to learn a simulator (state -> state
| transition model).
|
| Danijar has DreamerV2 and a robot-dog application of Dreamer on
| his GitHub, for those interested in the implementation.
| ipnon wrote:
| What's the fastest way to grok this paper with an undergraduate-
| level understanding of deep learning? What is the go-to resource
| for reinforcement learning today? The field is quickly changing.
| machiaweliczny wrote:
| I suggest just reading it carefully, noting what you don't
| understand, and working from there. You can use ChatGPT[0] and
| Explainpaper[1] to assist you, or ask on some relevant Discord
| channels.
|
| First topic: AutoEncoder
|
| An autoencoder is a type of neural network architecture that is
| typically used for dimensionality reduction or feature
| learning. It is composed of two main parts: an encoder, which
| maps the input data to a lower-dimensional representation, and
| a decoder, which maps the lower-dimensional representation back
| to the original dimension. The goal of the autoencoder is to
| learn a compressed representation of the input data that
| captures the most important features, while discarding the less
| important ones. This is achieved by training the network to
| reconstruct the input data as accurately as possible, typically
| using a mean squared error or cross-entropy loss function.
|
| Ok, so an autoencoder is a program that does:
|
|   K = compress(state)
|   state' = decompress(K)
|
| You train it by minimising the reconstruction error between
| state and state'; the goal is for K to capture useful knowledge
| about the data.
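|
| (As a concrete toy example, here is a minimal linear autoencoder
| in numpy, trained by gradient descent on the reconstruction
| error. The data, layer sizes and learning rate are all made up
| for illustration.)

```python
import numpy as np

# Toy linear autoencoder: 10-D data that secretly lives on a 3-D
# subspace gets compressed to a 3-D code K and decompressed back.
rng = np.random.default_rng(0)
basis = rng.normal(size=(3, 10))
data = rng.normal(size=(200, 3)) @ basis      # shape (200, 10)

W_enc = rng.normal(scale=0.1, size=(10, 3))   # compress
W_dec = rng.normal(scale=0.1, size=(3, 10))   # decompress

lr = 0.02
for _ in range(3000):
    K = data @ W_enc                  # K = compress(state)
    recon = K @ W_dec                 # state' = decompress(K)
    err = recon - data                # state' - state
    # Gradients of the squared reconstruction error w.r.t. weights.
    grad_dec = K.T @ err / len(data)
    grad_enc = data.T @ (err @ W_dec.T) / len(data)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

mse = float(np.mean((data @ W_enc @ W_dec - data) ** 2))
print(f"reconstruction MSE: {mse:.4f}")
```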
|
| Next to solve: "What is Recurrent State-Space Model in ML?"
|
| ChatGPT: A Recurrent State-Space Model (RSSM) is a type of
| statistical model that is often used for time series data
| analysis. It is a combination of a Recurrent Neural Network
| (RNN) and a State-Space Model (SSM).
|
| A SSM is a mathematical model that describes the evolution of a
| system over time in terms of a set of state variables. The
| state variables are typically considered hidden variables,
| meaning that they cannot be directly observed, but their values
| can be inferred from observations of the system. A SSM is often
| used to model dynamic systems, such as physical or biological
| systems.
|
| A RNN is a type of neural network architecture that is
| particularly well-suited for processing sequential data, such
| as time series or sequences of words. The key feature of an RNN
| is that it contains a set of recurrent connections, which allow
| information to flow through the network across multiple time
| steps.
|
| An RSSM combines the strengths of an RNN and an SSM. The RNN is
| responsible for handling sequential, temporal dependencies in
| the data, and the SSM is responsible for inferring the
| underlying state variables of the system. These state variables
| serve as the internal representation of the data, a compressed
| form, so the state space can be seen as a low-dimensional
| encoding of the data.
|
| An RSSM can be used for a wide range of tasks, such as
| forecasting, prediction, and control of dynamic systems,
| sequential decision making, and generative modelling of time
| series data.
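|
| (To make the structure concrete, here is a toy numpy sketch of
| an RSSM-style imagination rollout. The weights are random
| placeholders, not learned, and this is not Dreamer's actual
| architecture; it only shows the deterministic recurrent state h
| plus a stochastic latent z sampled from a prior conditioned on
| h.)

```python
import numpy as np

# RSSM-style rollout (toy sketch, NOT Dreamer's real code):
# h is the deterministic recurrent state carrying temporal context;
# z is a stochastic latent sampled from a prior conditioned on h.
# Weights are random placeholders; a real model learns them so that
# (h, z) predicts future observations and rewards.
rng = np.random.default_rng(0)
H, Z, A = 8, 4, 2                     # sizes of h, z, and the action

W_h = rng.normal(scale=0.3, size=(H + Z + A, H))   # recurrent update
W_mu = rng.normal(scale=0.3, size=(H, Z))          # prior mean head
W_std = rng.normal(scale=0.3, size=(H, Z))         # prior log-std head

def step(h, action):
    """One imagined step: sample z ~ p(z | h), then update h."""
    mu = np.tanh(h @ W_mu)
    std = np.exp(np.clip(h @ W_std, -3.0, 1.0))    # positive, bounded std
    z = mu + std * rng.normal(size=Z)              # reparameterised sample
    h_next = np.tanh(np.concatenate([h, z, action]) @ W_h)
    return h_next, z

h = np.zeros(H)
for t in range(5):                                 # "dream" 5 steps ahead
    h, z = step(h, np.array([1.0, 0.0]))
    print(f"t={t}  |h| = {np.linalg.norm(h):.2f}")
```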
|
| > Can you explain how each of those work in layman terms and
| give pseudo code examples?
|
| ....
|
| [0] https://chat.openai.com/chat
|
| [1] https://www.explainpaper.com/
| phailhaus wrote:
| ChatGPT is a chatbot, not a resource. I would avoid using it
| to learn topics because there is no guarantee that anything
| it says is actually correct.
| schizo89 wrote:
| AI is huge, but this paper is about Reinforcement Learning.
| Udacity has a nicely paced course that's possible to complete
| in 2-3 months without mental overload. It's focused on the
| Reinforcement Learning problem definition, Dynamic Programming,
| and hands-on work with foundational algorithms such as DQN,
| PPO, SAC and Monte-Carlo Tree Search.
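|
| (For a feel of what those foundational algorithms look like,
| here is a minimal tabular Q-learning sketch on a made-up 5-state
| chain; DQN is roughly this update rule with a neural network
| replacing the table. All numbers are illustrative.)

```python
import numpy as np

# Tabular Q-learning on a 5-state chain: start at state 0, get
# reward 1 for reaching state 4. Q-learning is off-policy, so we
# can learn the greedy policy while behaving completely at random.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9

for _ in range(500):                  # episodes
    s = 0
    for _ in range(20):               # step limit per episode
        a = int(rng.integers(n_actions))   # random behaviour policy
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: bootstrap from the best next-state value.
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next
        if r > 0:
            break

greedy = [int(np.argmax(Q[s])) for s in range(n_states - 1)]
print("greedy policy (0=left, 1=right):", greedy)
```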
|
| Auto-encoders, Transformers and recurrent models don't pop up
| much in RL except in advanced topics and cutting-edge research.
___________________________________________________________________
(page generated 2023-01-11 23:01 UTC)