[HN Gopher] The potential of transformers in reinforcement learning
___________________________________________________________________
The potential of transformers in reinforcement learning
Author : beefman
Score : 54 points
Date : 2021-12-19 18:59 UTC (1 day ago)
(HTM) web link (lorenzopieri.com)
(TXT) w3m dump (lorenzopieri.com)
| Buttons840 wrote:
| What's a good introduction to transformers?
| tchalla wrote:
| I have seen a lot of introductions that explain the mechanics.
| However, I haven't seen one that explains the intuition behind
| why it works.
| lalaithion wrote:
| Not sure if this is a good introduction, but a good second
| paper to read is https://arxiv.org/abs/2106.06981
|
| You can think of finite state machines as being two functions:
| f(input, state) = output, and g(input, state) = next_state.
| (Traditional FSMs have 3 'output' states, basically 'terminated
| - success', 'terminated - failure', and 'still working', but in
| theory it makes sense to fully generalize it).
|
| If you think about plain neural networks as approximating
| arbitrary functions f(input) = output, then recurrent neural
| networks are "continuous state machines", where you have the
| same two functions f(input, state) = output, and g(input,
| state) = next_state, except instead of being finite symbols,
| they're continuous points in N-dimensional space. This, at
| least to me, clarifies why recurrent neural networks work on
| simple and short time-based problems, but can't efficiently
| generalize to complex problems--they're just FSMs!
|
| The paper I linked above provides a similar high-level
| computational analogy to how transformers work.
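|
| Concretely, here is a toy numpy sketch of that "continuous state
| machine" view (my own made-up example with random weights, not
| code from the paper):
|
|   import numpy as np
|
|   def f(x, h, W_out):
|       # output = function of (input, state)
|       return W_out @ np.concatenate([x, h])
|
|   def g(x, h, W_h):
|       # next_state = function of (input, state)
|       return np.tanh(W_h @ np.concatenate([x, h]))
|
|   # toy sizes: 3-dim inputs, 4-dim hidden state, 2-dim outputs
|   rng = np.random.default_rng(0)
|   W_out, W_h = rng.normal(size=(2, 7)), rng.normal(size=(4, 7))
|
|   h = np.zeros(4)                    # initial state
|   for x in rng.normal(size=(5, 3)):  # a short input sequence
|       y = f(x, h, W_out)             # f(input, state) = output
|       h = g(x, h, W_h)               # g(input, state) = next_state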
| atty wrote:
| If you mean the technical details of attention models, the
| original paper "Attention Is All You Need" is not too difficult
| to read. If you're more interested in applications, Hugging
| Face has a "course" on their website that walks through the
| high-level topics of applying transformers to natural language
| processing (can't remember if they cover transformers for other
| topics).
| criticaltinker wrote:
| The original paper that introduced the Transformer architecture
| is quite accessible and outlines a lot of the history and
| rationale for the design [1].
|
| [1] https://arxiv.org/pdf/1706.03762.pdf
| mrfusion wrote:
| I tried that, but it seems to gloss over what an encoder, etc.
| actually are.
|
| I think I'd do better with pseudo code or a toy example.
| saynay wrote:
| The encoder is the neural-net that converts the input to
| the embedding vector. The decoder is the neural-net that
| converts that vector into output. What that embedding
| vector "means" is whatever the entire algorithm has learned
| it means.
|
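| Schematically, something like this toy PyTorch sketch (made-up
| sizes, purely illustrative, not any particular model):
|
|   import torch
|   import torch.nn as nn
|
|   x = torch.randn(100)  # some input representation
|
|   # encoder: input -> embedding vector (here, 16 numbers)
|   encoder = nn.Sequential(nn.Linear(100, 32), nn.ReLU(),
|                           nn.Linear(32, 16))
|   # decoder: embedding vector -> output
|   decoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(),
|                           nn.Linear(32, 100))
|
|   embedding = encoder(x)        # what x "means" to the model
|   output = decoder(embedding)   # output produced from that vector
|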
| For a more simplified look at embeddings, I would look at
| Word2Vec (although it doesn't involve transformers). It
| encodes single words, instead of entire phrases, and does
| so by looking at their relative position to other words
| while being trained.
|
| Embeddings are just vectors, and so you can do math or
| compare them to other embeddings. The famous example is
| E(king) - E(man) + E(woman) = E(queen)
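|
| With made-up toy vectors (real Word2Vec embeddings are learned
| and have hundreds of dimensions, and the equality only holds
| approximately, as a nearest-neighbour lookup):
|
|   import numpy as np
|
|   E = {  # toy 3-d "embeddings", hand-picked for illustration
|       "king":  np.array([0.9, 0.8, 0.1]),
|       "man":   np.array([0.5, 0.1, 0.1]),
|       "woman": np.array([0.5, 0.1, 0.9]),
|       "queen": np.array([0.9, 0.8, 0.9]),
|       "apple": np.array([0.1, 0.9, 0.2]),
|   }
|
|   def cosine(a, b):
|       return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
|
|   target = E["king"] - E["man"] + E["woman"]
|   best = max((w for w in E if w not in ("king", "man", "woman")),
|              key=lambda w: cosine(E[w], target))
|   print(best)  # queen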
| mrfusion wrote:
| So you're saying the encoding could be a neural net OR
| something like word2vec?
| criticaltinker wrote:
| Check out The Annotated Transformer, it's one of my
| favorite references! It contains straightforward Python
| code side by side with excerpts from the original paper.
|
| http://nlp.seas.harvard.edu/2018/04/03/attention.html
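|
| And for a minimal toy example of the core operation, scaled
| dot-product attention from the paper can be written in a few
| lines of numpy (my own sketch, not code from the post above):
|
|   import numpy as np
|
|   def softmax(x, axis=-1):
|       e = np.exp(x - x.max(axis=axis, keepdims=True))
|       return e / e.sum(axis=axis, keepdims=True)
|
|   def attention(Q, K, V):
|       # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
|       d_k = K.shape[-1]
|       return softmax(Q @ K.T / np.sqrt(d_k)) @ V
|
|   # toy example: 4 tokens, model width 8
|   rng = np.random.default_rng(0)
|   x = rng.normal(size=(4, 8))            # token representations
|   W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
|   out = attention(x @ W_q, x @ W_k, x @ W_v)
|   print(out.shape)  # (4, 8): one updated vector per token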
| beefman wrote:
| Transformers from Scratch
|
| link: https://e2eml.school/transformers.html
|
| discussion here: https://news.ycombinator.com/item?id=29315107
| dpflan wrote:
| Another resource: "The Illustrated Transformer"
|
| - https://jalammar.github.io/illustrated-transformer/
|
| - HN post for the article:
| https://news.ycombinator.com/item?id=18351674
| visarga wrote:
| For accessibility, I recommend Yannic Kilcher's video review of
| "Attention Is All You Need"
|
| https://www.youtube.com/watch?v=iDulhoQ2pro
|
| Yannic has made about 62 other transformer paper reviews since
| then. You can find the usual suspects there.
|
| https://www.youtube.com/watch?v=u1_qMdb0kYU&list=PL1v8zpldgH...
| moffkalast wrote:
| Transformers (2007)
| timy2shoes wrote:
| I prefer to go to the original source, specifically The
| Transformers (1984-87).
| visarga wrote:
| So transformers have done it again: another sub-field of ML with
| all its past approaches surpassed by a simple language model, at
| least when there is enough data.
|
| So far they can handle text, images, video, code, proteins, and
| now planning and behavior. It's like a universal algorithm for
| learning, and it reminds me of the uniformity of the brain. I
| hope we're going to see much more efficient hardware
| implementations in the future.
| blovescoffee wrote:
| I wouldn't say they've "done it" quite yet. There's definitely
| an application for imitation learning, but that might be it. A
| translation of the work in sequence-to-sequence to sequence-to-
| action is something I've also considered researching. A few
| challenges exist which the author touches on in just one
| sentence. First, we need data about previous sequences of
| actions, and this is necessarily a challenge in many fields in
| robotics/learning. A related problem is that of exploration:
| how exactly should we inform the exploration of new sequences?
| Also, if our policy is based on the prediction of a
| Transformer, does it have the traditional desirable properties
| of a policy in an RL environment? Off the top of my head it
| seems like a Transformer fed into an MLP would probably be a
| fit (rough sketch below), but I'm not sure. Transformers do
| seem promising, but it's a bit early to say they've "done it" :)
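|
| For what I mean by "a Transformer fed into an MLP", a rough
| PyTorch sketch (purely illustrative, made-up sizes, not from
| the article):
|
|   import torch
|   import torch.nn as nn
|
|   class TransformerPolicy(nn.Module):
|       # encode a history of observations, map to action logits
|       def __init__(self, obs_dim=16, d_model=64, n_actions=4):
|           super().__init__()
|           self.embed = nn.Linear(obs_dim, d_model)
|           layer = nn.TransformerEncoderLayer(
|               d_model=d_model, nhead=4, batch_first=True)
|           self.encoder = nn.TransformerEncoder(layer, num_layers=2)
|           self.head = nn.Sequential(      # the MLP on top
|               nn.Linear(d_model, 64), nn.ReLU(),
|               nn.Linear(64, n_actions))
|
|       def forward(self, obs_seq):         # (batch, time, obs_dim)
|           h = self.encoder(self.embed(obs_seq))
|           return self.head(h[:, -1])      # logits, last timestep
|
|   policy = TransformerPolicy()
|   logits = policy(torch.randn(8, 10, 16))  # 8 envs, 10-step history
|   print(logits.shape)                      # torch.Size([8, 4])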
___________________________________________________________________
(page generated 2021-12-20 23:01 UTC)