[HN Gopher] Ask HN: What were the papers on the list Ilya Sutske...
       ___________________________________________________________________
        
       Ask HN: What were the papers on the list Ilya Sutskever gave John
       Carmack?
        
       John Carmack's new interview on AI/AGI [1] carries a puzzle:  "So I
       asked Ilya Sutskever, OpenAI's chief scientist, for a reading list.
       He gave me a list of like 40 research papers and said, 'If you
       really learn all of these, you'll know 90% of what matters today.'
       And I did. I plowed through all those things and it all started
       sorting out in my head."  What papers do you think were on this
       list?  [1] https://dallasinnovates.com/exclusive-qa-john-carmacks-
       different-path-to-artificial-general-intelligence/
        
       Author : alan-stark
       Score  : 338 points
       Date   : 2023-02-03 14:24 UTC (8 hours ago)
        
       | albertzeyer wrote:
       | (Partly copied from
       | https://news.ycombinator.com/item?id=34640251.)
       | 
       | On models: Obviously, almost everything is Transformer nowadays
       | (Attention is all you need paper). However, I think to get into
       | the field, to get a good overview, you should also look a bit
        | beyond the Transformer. E.g. RNNs/LSTMs are still a must-learn,
        | even though Transformers might be better at many tasks. And then
       | all those memory-augmented models, e.g. Neural Turing Machine and
       | follow-ups, are important too.
       | 
       | It also helps to know different architectures, such as just
       | language models (GPT), attention-based encoder-decoder (e.g.
       | original Transformer), but then also CTC, hybrid HMM-NN,
       | transducers (RNN-T).
       | 
        | Some self-promotion: I think my PhD thesis does a good job of
        | giving an overview of this: https://www-i6.informatik.rwth-
       | aachen.de/publications/downlo...
       | 
        | Diffusion models are another recent, different kind of model.
       | 
        | Then, a separate topic is the training aspect. Most papers do
        | supervised training, using a cross-entropy loss against the
        | ground-truth target. However, there are many others:
       | 
       | There is CLIP to combine text and image modalities.
       | 
        | There is the whole field of unsupervised or self-supervised
        | training methods. Language model training (next-token
        | prediction) is one example, but there are others.
       | 
       | And then there is the big field on reinforcement learning, which
       | is probably also quite relevant for AGI.
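The training setup described above (supervised cross-entropy against a ground-truth target, with language modeling as next-token prediction) can be sketched minimally. The vocabulary and logits below are made-up toy values, not from any real model:

```python
import math

# Next-token prediction as supervised training: the model emits logits
# over a vocabulary, and the loss is the cross entropy against the
# ground-truth next token.
def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    return -math.log(softmax(logits)[target_index])

vocab = ["the", "cat", "sat", "mat"]
logits = [2.0, 0.5, 1.0, -1.0]                # model scores for next token
loss = cross_entropy(logits, target_index=0)  # ground truth is "the"
```

Training then consists of nudging the model's parameters to lower this loss over many (context, next-token) pairs.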
        
         | [deleted]
        
         | hardware2win wrote:
         | I do wonder whether people behind Attention is all you need
         | paper
         | 
         | Will receive Turing Award
         | 
         | It is being cited often
        
           | mirekrusin wrote:
            | The guy who said "I don't understand all of this, can we
            | just throw more machines at it?" should get the award.
        
           | albertzeyer wrote:
            | The authors did not really expect it to have such a huge
            | influence. You could also argue it is a somewhat natural
            | next step. This paper invented neither self-attention nor
            | attention. Attention was already very popular, specifically
            | for machine translation, and a few other papers had already
            | used self-attention at that point in time. It was just the
            | first paper to rely solely on attention and self-attention.
        
           | RC_ITR wrote:
           | >Will receive Turing Award
           | 
            | This is the weird thing - hopefully not! Hopefully there are
            | even better NN models coming out every 5-10 years, and we'll
            | look back on transformers as 'just a phase', sort of like
            | how we look back at RNNs today as potentially obsolete
            | technology (though they were no less of an amazing
            | achievement - look at the proliferation of voice
            | assistants).
            | 
            | For example, attention is great and does a really good job
            | of simulating context in language, but what if we come up
            | with a clever way to simulate symbology? Then we actually
            | are back on the path to AGI and transformers will look like
            | child's play.
        
             | Beldin wrote:
             | > _symbology_
             | 
              | Off-topic, but now I have Willem Dafoe going "What's the
             | 'symbology' here? The _symbolism_ ... " in my head (from
             | Boondock Saints).
        
               | Gee101 wrote:
                | Even though I watched that movie 20 years ago, I will
                | never forget that scene.
        
           | mattcaldwell wrote:
           | Came here expecting a Haiku.
        
             | qwertyforce wrote:
             | Neural nets advance,
             | 
             | Attention is all you need,
             | 
             | Computing ascends.
             | 
             | #by chatgpt
        
             | maxbond wrote:
             | The authors who wrote
             | 
             | "Attention is all you need" -
             | 
             | Turing candidates?
        
               | fastball wrote:
               | The people behind
               | 
               | "Attention is all you need"
               | 
               | Are often cited
        
               | andrelaszlo wrote:
               | Attention.
               | 
               | Attention.
               | 
               | Attention.
               | 
               | - Ikkyu
        
           | modeless wrote:
           | The Adam optimizer is another possibility. It's unbelievably
           | good and everyone uses it.
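For reference, the Adam update (Kingma & Ba, 2014) is compact enough to sketch. This toy version works on a single scalar parameter with the commonly cited default hyperparameters; it is illustrative only:

```python
import math

# One Adam step: exponential moving averages of the gradient (m) and
# its square (v), bias-corrected, then a scaled parameter update.
def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad            # first-moment estimate
    v = b2 * v + (1 - b2) * grad * grad     # second-moment estimate
    m_hat = m / (1 - b1 ** t)               # bias correction
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize f(x) = x^2, whose gradient is 2x.
x, m, v = 5.0, 0.0, 0.0
for t in range(1, 1001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.1)
# x ends up near the minimum at 0
```

Part of why it is so popular: the per-parameter normalization by sqrt(v_hat) makes the step size roughly scale-invariant, so the same learning rate works across very different layers.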
        
           | seydor wrote:
            | I remember an interview with one of the founders of OpenAI
            | saying that if it wasn't the transformer architecture, it
            | would be something else. What really matters is the scale of
            | the model. The transformer is only one of the possible
            | configurations that work well with text. It seems they stuck
            | with it because it works so well - why break things?
        
         | alan-stark wrote:
         | Thanks for sharing. Cool to see someone from Aachen NLP group.
         | I'll be visiting Aachen/Dusseldorf/Heidelberg area in spring.
         | Do you know of any local ML meetups open to general (ML
         | engineer/programmer) public?
        
           | albertzeyer wrote:
            | Unfortunately, not really. We used to have some RWTH-
            | internal meetups, although those were somewhat interrupted
            | by Corona and haven't really recovered since.
           | 
           | Aachen has quite a few companies with activity on NLP or
           | speech recognition, mostly due to my professor Hermann Ney.
           | E.g. there is Apple, Amazon, Nuance, eBay. And lesser-known
           | AppTek. And in Cologne, you have DeepL. In all those
           | companies, you find many people from our group. And then, at
           | the RWTH Aachen University, you have our NLP/speech group,
           | and also the computer vision group.
        
       | hexhowells wrote:
       | While not all papers, this list contains a lot of important
       | papers, writings, and conversations currently in AI:
       | https://docs.google.com/document/d/1bEQM1W-1fzSVWNbS4ne5PopB...
        
       | querez wrote:
       | A lot of other posts here are biased to recent papers, and papers
       | that had "a big impact", but miss a lot of foundations. I think
       | this reddit post on the most foundational ML papers gives a lot
       | more balanced overview:
       | https://www.reddit.com/r/MachineLearning/comments/zetvmd/d_i...
        
       | cloudking wrote:
       | Ilya's publications may be on the list
       | https://scholar.google.com/citations?user=x04W_mMAAAAJ&hl=en
        
       | mgaunard wrote:
        | In my experience, all deep learning is overhyped, and most
        | needs that are not already addressable by linear regression can
        | be met with simple supervised learning.
        
       | optimalsolver wrote:
       | Carmack says he's pursuing a different path to AGI, then goes
       | straight to the guy at the center of the most saturated area of
       | machine learning (deep learning)?
       | 
       | I would've hoped he'd be exploring weirder alternatives off the
       | beaten path. I mean, neural networks might not even be necessary
       | for AGI, but no one at OpenAI is going to tell Carmack that.
        
         | albertzeyer wrote:
         | It is possible to use neural networks and still be on a quite
         | different path than the mainstream.
         | 
          | Of course, there is a group of people defending symbolic
          | computation, e.g. see Gary Marcus, and always pushing back on
          | connectionism (neural networks).
          | 
          | But this is somewhat of a spectrum, or rather sloppy
          | terminology. Once you move away from symbolic computation,
          | many things can be interpreted as neural networks. And there
          | is also all of computational neuroscience, which also works
          | with variants of neural networks.
          | 
          | And there is the human brain, which demonstrates that a
          | neural network is capable of AGI. So why would you not want a
          | neural network? But that does not mean you cannot do many
          | things very differently from the mainstream.
        
         | throwaway4837 wrote:
         | Did you read the full article? In science, you should usually
         | have a very solid understanding of what the top minds in the
         | field are fixated on as it allows you to try something
         | different with confidence, and prevents you from pulling a
         | Ramanujan, reinventing the exact same wheel. I can't think of a
         | single scientist who caused a paradigm shift and didn't have an
         | intimate understanding of the current status quo.
        
         | ly3xqhl8g9 wrote:
          | The most off-the-beaten-path approach to AGI I've heard
          | through the grapevine is to not have artificial neural
          | networks, as in algorithms involving matmul running on
          | silicon, at all. Instead, on the principle that the laziest
          | engineer is the best engineer, it relies on the fact that
          | neurons, actual neurons from someone's brain, already "know"
          | how to make efficient, good-enough, general learning
          | architectures. Therefore, to obtain programmatic human-like
          | intelligence, one would 'simply'+ have to implant them not in
          | mice [1] but in an actual vat, and 'simply' interface with
          | whatever a group of neurons can be called, a soma(?). Given
          | this Brain-on-a-Chip architecture, we wouldn't have to stick
          | GPUs in our cars to achieve self-driving, but even more
          | wetware (and of course, ignore the occasional screams of
          | dread as the wetware becomes aware of itself and how
          | condemned it is to an existence of
          | left-right-accelerate-brake).
         | 
          | It would have been interesting to see someone like Carmack
          | going in this direction, but from the little detail he gave,
          | he seems less interested in cells and Kjeldahl flasks and
          | more in the same type-a-type-a on the ol' QWERTY.
         | 
         | + 'simply' might involve multiple decades of research and
         | Buffett knows how many billions
         | 
         | [1] Human neurons implanted in mice influence behavior,
         | https://www.nature.com/articles/s41586-022-05277-w
        
         | mindcrime wrote:
         | Wouldn't it be fair to say that one has to know what the
         | current path _is_ and have some idea where it leads and what
         | its issues are, before forging a new path?
         | 
         | I mean, any idiot can go off-trail and start blundering around
         | in the weeds, and ultimately wind up tripping, falling, hitting
         | their head on a rock, and drowning to death in a ditch. But
         | actually finding a new, better, more efficient path probably
         | involves at least _some_ understanding of the status quo.
        
           | someweirdperson wrote:
            | To walk a path, no knowledge of existing paths is needed.
            | But to be able to claim it is new, it is. Even more so to
            | claim that the new path is better.
        
             | fnordpiglet wrote:
             | Bias and ignorance are two different things. No knowledge
             | is ignorance. Bias is using knowledge to judge new
             | knowledge. The goal isn't to pursue things with raging
             | ignorance but to pursue them with no bias and collecting
             | knowledge without conclusion, then once you're
             | knowledgeable of what is there you can take off with raging
             | ignorance in the direction no one has gone before. But you
             | can't do than holding bias any more than you can having
             | ignorance of what directions have been gone before.
        
           | agar wrote:
           | > probably involves at least some understanding of the status
           | quo.
           | 
           | Oh man, you had me going with such a vivid metaphor. I was
           | really hoping for a payoff in the end, but you abandoned it.
            | The easy close would be "probably involves at least _some_
            | understanding of the existing terrain", but I was
            | optimistic for something less prosaic.
        
             | mindcrime wrote:
             | Sorry to disappoint. My creative juices aren't flowing
             | today I guess. Need more coffee, or something!
        
         | pavon wrote:
         | What a waste it would be to think you are pursuing a different
         | path only to discover you spent a year reinventing something
         | that you could have learned by reading papers for a few days.
        
         | GuB-42 wrote:
         | If you want to be off the beaten path, you have to know where
         | the beaten path is.
         | 
         | Otherwise you may end up walking the ditch beside the beaten
         | path. It is slow and difficult, but it won't get you anywhere
         | new.
         | 
         | For example, you may try an approach that doesn't look like
         | deep learning, but after a lot of work, realize that you
         | actually reinvented deep learning, poorly. We call these things
         | neurons, transformers, backpropagation, etc... but in the end,
          | it is just maths. If you find that your "alternative" is very
          | well suited to linear algebra and gradient descent, once you
          | have found the right formulas, you may realize that they are
          | equivalent to the ones used in traditional "deep learning"
          | algorithms. It helps to recognize this early and take
          | advantage of all the work done before you.
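As a concrete illustration of how little is left once you strip away the vocabulary: a single linear "neuron" fit by gradient descent is just calculus on a squared-error formula. The data here is made up for illustration:

```python
# A single linear "neuron" y = w * x, fit by gradient descent on the
# mean squared error over toy data. The update rule is plain calculus:
# w <- w - lr * dLoss/dw.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x
w, lr = 0.0, 0.05
for _ in range(200):
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad
# w converges to 2.0
```

Any "alternative" whose fitting procedure reduces to this kind of derivative-following on a differentiable error is, mathematically, in the same family as deep learning.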
        
         | ramraj07 wrote:
          | This is pretty much the same deal in biology as well. At
          | Calico, at Verily, at CZI, even at Allen - same story: they
          | say they will reinvent biology research, then go hire the
          | same narrow-minded professors and CEOs who run the status
          | quo, and end up as one more of the same.
          | 
          | Neuralink is the only place where this pattern seemed to
          | break a bit, but then it seems Elon went down his own path,
          | pushing for faster results and breaking basic ethics.
        
       | chrgy wrote:
        | From ChatGPT. Personally I think this list is a bit old, but it
        | should cover at least the 60% mark:
        | 
        | Deep Learning: AlexNet (2012), VGGNet (2014), ResNet (2015),
        | GoogleNet (2015), Transformer (2017)
        | 
        | Reinforcement Learning: Q-Learning (Watkins & Dayan, 1992),
        | SARSA (Sutton & Barto, 1998), DQN (Mnih et al., 2013), A3C
        | (Mnih et al., 2016), PPO (Schulman et al., 2017)
        | 
        | Natural Language Processing: Word2Vec (Mikolov et al., 2013),
        | GLUE (Wang et al., 2018), ELMo (Peters et al., 2018), GPT
        | (Radford et al., 2018), BERT (Devlin et al., 2019)
        
       | throwaway4837 wrote:
       | Wow, crazy coincidence that you all read this article yesterday
       | too. I was thinking of emailing one of them for the list, then I
       | fell asleep. Cold emails to scientists generally have a higher
       | success-rate than average in my experience.
        
        | theusus wrote:
        | As if papers were that comprehensible.
        
       | databroker wrote:
       | [dead]
        
       | username3 wrote:
       | They asked on Twitter and he didn't reply. We need someone with a
       | blue check mark to ask.
       | https://twitter.com/ifree0/status/1620855608839897094
        
         | mirekrusin wrote:
         | Ask Elon to ask him.
        
       | touringa wrote:
       | https://lifearchitect.ai/papers/
        
       | layer8 wrote:
       | [flagged]
        
         | siekmanj wrote:
         | "RL: A Deep Reinforcement Learning Framework" seems to have
         | been hallucinated, does not exist.
        
           | homarp wrote:
           | https://arxiv.org/abs/1611.02779 is the closest - RL2: Fast
           | Reinforcement Learning via Slow Reinforcement Learning
        
         | nathias wrote:
         | I got:
         | 
         | Some of the highly influential papers in the field of AI that
         | could have been on the list include "Generative Adversarial
         | Networks" by Ian Goodfellow et al., "Attention is All You Need"
         | by Vaswani et al., "AlexNet: ImageNet Classification with Deep
         | Convolutional Neural Networks" by Alex Krizhevsky et al.,
         | "Playing Atari with Deep Reinforcement Learning" by Volodymyr
         | Mnih et al., "Human-level control through deep reinforcement
         | learning" by Volodymyr Mnih et al., "A Few Useful Things to
         | Know About Machine Learning" by Pedro Domingos, among many
         | others.
        
         | caxco93 wrote:
         | This comment feels very ChatGPTy
        
       | Phil_Latio wrote:
       | Not in the list: https://arxiv.org/pdf/1805.09001.pdf
        
       | dang wrote:
       | Recent and related:
       | 
       |  _John Carmack's 'Different Path' to Artificial General
       | Intelligence_ - https://news.ycombinator.com/item?id=34637650 -
       | Feb 2023 (402 comments)
        
       | mritchie712 wrote:
       | [flagged]
        
       | KRAKRISMOTT wrote:
       | Start tweeting at him until he shares
        
         | fnordpiglet wrote:
         | Clearly do this by tweet storming him via LLM
        
           | steveBK123 wrote:
           | As an AI LLM, I cannot decide which academic papers are
           | "best" as the idea of "best" is subjective and there are many
           | different factors that need to be considered.
        
             | cwillu wrote:
             | I apologize for the oversight, you are correct. Let me know
             | if there's anything else I can help you with.
        
       | sho_hn wrote:
       | > "You'll find people who can wax rhapsodic about the singularity
       | and how everything is going to change with AGI. But if I just
       | look at it and say, if 10 years from now, we have 'universal
       | remote employees' that are artificial general intelligences, run
       | on clouds, and people can just dial up and say, 'I want five
       | Franks today and 10 Amys, and we're going to deploy them on these
       | jobs,' and you could just spin up like you can cloud-access
       | computing resources, if you could cloud-access essentially
       | artificial human resources for things like that--that's the most
       | prosaic, mundane, most banal use of something like this."
       | 
       | So, slavery?
        
         | aj7 wrote:
         | Computer time is paid for.
        
           | sho_hn wrote:
           | Will the AIs own the computers?
        
         | i_s wrote:
         | Sounds like 'Age of Em' by Robin Hanson: https://ageofem.com/
        
         | hosolmaz wrote:
         | Related: https://qntm.org/mmacevedo
        
           | sho_hn wrote:
           | I was quoting "Measure of a Man" :-)
           | 
           | "Lena" is a bit of different case because it's not AGI.
           | Probably ripe for the "forced prison labor" suggested by your
           | sibling as the moral cop-out. Imagine being sentenced to
           | being a cloud VM image!
        
             | EamonnMR wrote:
             | Is there a good way to distinguish between the brain dumps
             | in Lena and what you'd call an AGI?
        
               | sho_hn wrote:
                | A brain dump has a history, and we ascribe meaning to
                | the past. As mentioned, the thread here has brought up
                | forced prison labor as a form of socially acceptable
                | slavery, and society could convince itself that a given
                | brain dump deserves its fate, even that it is a form of
                | atonement.
                | 
                | Artificial life, on the other hand, is presumably "pure
                | at birth".
               | 
               | Of course it's not that easy. You could discuss whether
               | individual instances have unique sets of human rights,
               | and value potential futures over pasts.
        
         | klabb3 wrote:
         | I think there's broad consensus that slavery only applies to
         | human labor. Even within that spectrum people avoid the term
         | (see forced prison labor). We also don't use it for animal
         | labor, for instance.
        
           | mike_d wrote:
           | If we ever conjure a way to capture the human consciousness
           | and preserve it before death, "AI" will be based on
           | indentured servitude.
           | 
           | The people given a second chance at life will be the ones who
           | are quickest at identifying traffic signals or fire hydrants
           | from a line up of images.
        
           | sho_hn wrote:
           | > animal labor
           | 
           | The context uses human-like/human-level a lot, but I agree
           | what level and type of intelligence commands human respect is
           | tricky business.
           | 
           | > forced prison labor
           | 
           | Would be interesting if we found ways to convince ourselves
           | the AIs had it coming.
           | 
           | Generally speaking, slavery has been morally acceptable and
           | popular before, and I will also not be surprised if we return
           | to those ways.
        
           | bathtub365 wrote:
           | Human slaves were often considered to be less than human or,
           | at the very least, not deserving of basic rights that other
           | humans enjoyed, as part of the moral and ethical frameworks
           | that supported the practice. I think we might see the same
           | shift in dominant ideology if we do have "true" AGI. I'm sure
           | I could be convinced that an intelligence that develops and
           | grows over a number of years begins to have a right to exist
           | and a right to freedom of expression and movement.
        
             | sho_hn wrote:
             | Given the outcry/backlash over Dall-E/ChatGPT (what is
             | "real art", etc.) and how much of our society is permeated
             | by a search for authenticity (perceived) already, I wonder
              | if you're right. We might decide "artificial" lifeforms
              | are
             | a lower class than "evolved in nature". For many religions
             | this could be a natural take - made by God vs. folly of
             | man, etc.
        
       | zomglings wrote:
       | [flagged]
        
         | winrid wrote:
          | The description in the book is that the cat was such a pain
          | it was "not adding value" to his life.
         | 
         | That's a bit more detail than just peeing on the sofa once.
        
         | Waterluvian wrote:
         | The story is apparently a bit more complex.
         | 
         | The cat was having a lot of behavioural issues and ultimately
         | he surrendered it to a shelter, where it may have been
         | euthanized if nobody adopted it. (note that nothing I'm saying
         | here is meant to condone or condemn the action)
         | 
         | The author editorialized it to fit the desired narrative, which
         | is a thing that happens quite a lot. Gotta sell them books!
        
           | ncann wrote:
           | There's a big difference between "had his cat put down" and
            | "surrendered it to a shelter". If it's indeed the latter (I
            | know nothing about the story), then there's nothing to talk
            | about.
        
         | tayo42 wrote:
          | Try to keep reminding yourself that you only respect them for
          | a tiny bit of specific knowledge. You don't need to like the
          | person they are.
          | 
          | Easier said than done, for sure. I have a couple of hobbies
          | where the top-tier people are just annoying in the rest of
          | their lives. Something about being really good at one thing
          | seems to correlate often with other insane personality
          | traits.
        
         | unixhero wrote:
         | I don't care. The cat was probably put down painlessly at a
         | vet. I don't see the issue at all. Let's not do these
         | cancelling attempts.
        
           | icepat wrote:
           | Having an animal killed for no good reason, other than it
           | caused you problems, is a bit twisted. In situations like
           | this, rehoming them is the ethical thing to do.
           | 
           | This is literally saying 'I don't care if he's killing
           | kittens'.
        
             | unixhero wrote:
             | Ask any vet if this happens all the time.
        
               | icepat wrote:
               | Yeah, but that does not mean it's not wildly unethical.
        
             | xnickb wrote:
             | Still completely irrelevant for the discussion
        
             | hungryforcodes wrote:
              | I enjoyed my meatballs today, btw! I'm joking, in the
              | sense that I didn't eat meatballs today, but we kill
              | animals all the time. That's what humans do. Do you eat
              | meat or foods cooked with animal fats?
             | 
             | I think you get my point.
             | 
             | A life is a life. I don't see why pets are somehow more
             | important than other animals.
        
           | headhasthoughts wrote:
           | In your opinion, was O.J. Simpson "cancelled?" Hans Reiser?
           | 
           | At what point do we draw a line and accept that a person can
           | be judged negatively for harm they cause?
        
           | ryanSrich wrote:
           | Why not put it up for adoption? Having it intentionally
           | killed is psychotic.
        
           | LeonenTheDK wrote:
            | Having a living thing killed because it did something you
            | don't like - something it doesn't understand is wrong - is
            | so incredibly heartless. Pets aren't toys meant to be
            | tossed when you get tired of them; they're living creatures
            | deserving of a good life.
           | 
           | Now if it was peeing everywhere because of a terminal medical
           | issue causing it pain, that's a bit of a different story. I
           | don't know the situation he and the cat were in. But if it is
           | just as the parent comment says, why kill the thing instead
           | of giving it up for a chance at a better life?
        
       | winwhiz wrote:
       | I had read that somewhere else and this is as far as I got
       | 
       | https://twitter.com/id_aa_carmack/status/1241219019681792010
        
       | klaussilveira wrote:
       | Following:
       | https://twitter.com/u3dcommunity/status/1621524851898089478?...
        
         | sebkomianos wrote:
         | Following: https://news.ycombinator.com/item?id=34643510
        
       | Liberonostrud wrote:
       | [flagged]
        
       | [deleted]
        
       | polskibus wrote:
       | What about just asking Carmack on twitter?
        
         | arbuge wrote:
         | Or, more directly, ask Sutskever...
        
         | jranieri wrote:
         | I did, without success.
        
           | belter wrote:
           | I asked him too.
           | 
           | He said:
           | 
           | - Who are you, and how did you get into my house?
        
             | wincy wrote:
             | I wouldn't advise this after seeing what Carmack did to
             | that guy he got in a headlock. [0] "That was the tap part",
             | makes me laugh every time.
             | 
             | [0] https://m.youtube.com/watch?v=X68Mm_kYRjc
        
           | TigeriusKirk wrote:
           | Is anyone asking Ilya Sutskever?
        
       | EvgeniyZh wrote:
       | Attention, scaling laws, diffusion, vision transformers,
       | Bert/Roberta, CLIP, chinchilla, chatgpt-related papers, nerf,
       | flamingo, RETRO/some retrieval sota
        
         | [deleted]
        
         | seydor wrote:
         | what do you mean 'scaling laws'?
        
           | EvgeniyZh wrote:
           | J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess,
           | R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei. Scaling
           | laws for neural language models. arXiv preprint
           | arXiv:2001.08361, 2020.
           | 
           | and multiple follow-ups
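The headline result of that paper is that test loss falls as a power law in model size, L(N) = (N_c / N)^alpha_N. A sketch with the constants reported in Kaplan et al.; treat the exact numbers as illustrative:

```python
# Parameter-count scaling law from Kaplan et al. (2020):
# L(N) = (N_c / N) ** alpha_N, with the paper's reported constants.
def expected_loss(n_params, n_c=8.8e13, alpha_n=0.076):
    return (n_c / n_params) ** alpha_n

# Doubling model size multiplies loss by the constant 2 ** -alpha_n,
# i.e. each doubling buys the same fractional improvement.
ratio = expected_loss(2e9) / expected_loss(1e9)
```

The follow-ups (e.g. Chinchilla) revise how the compute budget should be split between parameters and training tokens, but keep the same power-law framing.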
        
       | jimmySixDOF wrote:
       | >90% of what matters today
       | 
       | Strikes me as the kind of thing where that last 10% will need 400
       | papers
        
         | mindcrime wrote:
         | _" The first 90% is easy. It's the second 90% that kills ya."_
        
           | kabdib wrote:
           | "All projects are divided into three phases, each consisting
           | of 90% of the work."
           | 
           | -- _just about everything I 've shipped :-)_
        
         | michpoch wrote:
         | For the last 10% you'll need to write a paper yourself.
        
         | swyx wrote:
          | Maybe that's the part where he intends to deviate. He just
          | doesn't need to reinvent the settled science.
        
         | tikhonj wrote:
         | Along with the kind of details and tacit knowledge that never
         | makes it into papers...
        
       | sillysaurusx wrote:
       | "The email including them got lost to Meta's two-year auto-delete
       | policy by the time I went back to look for it last year. I have a
       | binder with a lot of them printed out, but not all of them."
       | 
       | RIP. If it's any consolation, it sounds like the list is at least
       | three years old by now. Which is a long time considering that
       | 2016 is generally regarded as the date of the deep learning
       | revolution.
        
         | pengaru wrote:
         | > If it's any consolation, it sounds like the list is at least
         | three years old by now.
         | 
         | In my experience when it comes to learning technical subjects
         | from a position of relative total ignorance, it's the older
         | resources that are the easiest to bootstrap knowledge from.
         | Then you basically work your way forward through the newer
         | texts, like an accelerated replay of a domain's progress.
         | 
         | I think it's kind of obvious that this would be the case when
         | you think about it. Just like how history textbooks can't keep
         | growing in size to give all past events an equal treatment, nor
         | can technical references as a domain matures.
         | 
         | You're forced to toss out whatever is deemed least relevant
         | today, and in technical domains that's often the material now
         | simply assumed as understood by the reader... whereas early
         | texts in a new space would have prioritized getting the reader
         | up to speed on something totally novel to the world.
        
         | moglito wrote:
         | "considering that 2016 is generally regarded as the date of the
         | deep learning revolution" --
         | 
         | I thought it was 2012, when AlexNet took the imagenet crown?
        
           | sillysaurusx wrote:
           | That's probably fair. But you'd be hard-pressed to find a DL
           | stack to try out your ideas with prior to 2016, since that's
           | when Tensorflow launched. :)
           | 
           | (Gosh, it's been less than a decade. Time sometimes doesn't
           | fly, considering how much it's changed the world since
           | then...)
        
             | abrichr wrote:
             | Theano was first released in 2007.
        
               | sillysaurusx wrote:
               | That's actually fascinating. Were there many experiments
               | done in it back in the 00's?
               | 
               | I'm just trying to imagine the things you could do with
               | it back then. 2007 had relatively fast GPUs for the
               | time, though nothing compared to today. They'd certainly
               | be enough for MNIST training, which makes me wonder what
               | else could have been done.
        
               | abrichr wrote:
               | You can look at Yoshua Bengio's Google Scholar profile
               | [1] and scroll down to see what they were working on
               | around that time.
               | 
               | Here are some papers with many citations:
               | 
               | - An empirical evaluation of deep architectures on
               | problems with many factors of variation [2]
               | 
               | - Extracting and composing robust features with denoising
               | autoencoders [3]
               | 
               | - Scaling learning algorithms towards AI [4]
               | 
               | [1] https://scholar.google.com/citations?hl=en&user=kukA0
               | LcAAAAJ...
               | 
               | [2] https://scholar.google.com/citations?view_op=view_cit
               | ation&h...
               | 
               | [3] https://scholar.google.com/citations?view_op=view_cit
               | ation&h...
               | 
               | [4] https://scholar.google.com/citations?view_op=view_cit
               | ation&h...
        
               | ladberg wrote:
               | FWIW in 2016 I was on an ML team at Apple that had been
               | shipping production neural networks on-device for a
               | while already. At the time, everyone used an assortment
               | of random tools (Theano, Torch, Caffe). I worked on an
               | internal tool that originally started as a Theano fork
               | but was closer to a modern-day TensorFlow XLA (and has
               | since been axed in favor of TensorFlow for most teams).
        
       | ilaksh wrote:
       | My guess is that multimodal transformers will eventually get us
       | most of the way there for general purpose AI.
       | 
       | But AGI is one of those very ambiguous terms. For many people
       | it's either an exact digital replica of human behavior that is
       | alive, or something like a God. I think it should also apply to
       | general purpose AI that can do most human tasks in a strictly
       | guided way, even if it lacks other characteristics of humans or
       | animals. For that, I think it can be built on advanced
       | multimodal transformer-based architectures.
       | 
       | For the other stuff, it's worth giving a passing glance to the
       | fairly extensive amount of research that has been labeled AGI
       | over the last decade or so. It hasn't really been mainstream
       | except maybe in the last couple of years, because truly
       | forward-looking people tend to be marginalized, including in
       | academia.
       | 
       | https://agi-conf.org
       | 
       | Looking forward, my expectation is that things like memristors or
       | other compute-in-memory will become very popular within say 2-5
       | years (obviously total speculation since there are no products
       | yet that I know of) and they will be vastly more efficient and
       | powerful especially for AI. And there will be algorithms for
       | general purpose AI possibly inspired by transformers or AGI
       | research but tailored to the new particular compute-in-memory
       | systems.
        
         | TimPC wrote:
         | Why do you think multimodal transformers will get us anywhere
         | near general purpose AI? Multimodal transformers are basically
         | a technology for sequence-to-sequence intelligent mappings and
         | it seems to me extremely unlikely that general intelligence is
         | one or more specific sequence-to-sequence mappings. Many
         | specific purpose problems are sequence-to-sequence but these
         | tend to be specialized functionalities operating in one or more
         | specific domains.
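
(Editorial sketch to make "sequence-to-sequence mapping" concrete: a transformer layer really is just a function from a length-T sequence of vectors to another length-T sequence, with scaled dot-product self-attention doing the mixing. Plain Python, with illustrative shapes and names.)

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def matmul(A, B):
    # Naive matrix product of two lists-of-rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def self_attention(X, Wq, Wk, Wv):
    """One head of scaled dot-product self-attention: a sequence of
    T d-dimensional vectors in, a sequence of T vectors out."""
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d = len(K[0])
    # Pairwise query-key similarities, scaled by sqrt(d).
    scores = [[sum(q * k for q, k in zip(qi, kj)) / math.sqrt(d)
               for kj in K] for qi in Q]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, V)  # each output is a weighted mix of values

random.seed(0)
T, d = 5, 8  # sequence length, model width (arbitrary toy sizes)
rand_mat = lambda r, c: [[random.gauss(0, 1) for _ in range(c)]
                         for _ in range(r)]
X = rand_mat(T, d)
Y = self_attention(X, rand_mat(d, d), rand_mat(d, d), rand_mat(d, d))
print(len(Y), len(Y[0]))  # sequence in, sequence out: 5 8
```

Whether stacking such mappings (plus feed-forward layers) amounts to general intelligence is exactly the disagreement in this subthread; the sketch only shows what kind of function is being stacked.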
        
           | RC_ITR wrote:
           | A lot of people don't really _get_ that our brains are a
           | bunch of specialized subcomponents that work in concert
           | (your pre-frontal cortex just cannot beat your heart, no
           | matter how optimized it gets). This is unsurprising, as our
           | brains are one of the most complex, hard-to-monitor things
           | on earth.
           | 
           | When an artificial tool that is really a _point solution_
           | "tricks" us into thinking it has replicated a task that
           | requires complex multi-component functioning within our
           | brain, we assume the tool is acting like our brain is acting.
           | 
           | The joke of course being that if you maliciously edited GPT's
           | index for translating vectors to words, it would produce
           | gibberish and we wouldn't care (despite being the exact same
           | core model).
           | 
           | We are only impressed by the complex sequence-to-sequence
           | strings it makes because the tokens happen to be words
           | (arguably the most important things in our lives).
           | 
           | EDIT: a great historic metaphor for this is how we thought
           | about 'computer vision' and CNNs. They do great at
           | identifying things in images, but notice that we still use
           | image-based captchas (even on OpenAI sites, no less)?
           | 
           | That's because it turns out optical illusions and context-
           | heavy images are things that CNNs really struggle with
           | (since the problem space is bigger than 'how are these
           | pixels arranged').
        
           | ilaksh wrote:
           | A couple of things.
           | 
           | 1) As I said, many people have different ideas of what we are
           | talking about. I assume that for you general purpose AI has
           | more capabilities, such as the ability to quickly learn tasks
           | to a high level on the fly. For me, it still qualifies as
           | general purpose if it can do most tasks but relies on a lot
           | of pre-training and, let's say, knowledge-base lookup.
           | 
           | 2) It seems obvious to me that ChatGPT demonstrates the
           | general-purpose utility of these types of LLMs, and it is
           | easy to speculate that something similar but with visual
           | input/output will be even more general. And so we are just
           | looking at a matter of degree by that definition.
        
         | mirekrusin wrote:
         | AGI will be an AI which can improve its own code after N
         | iterations, where N will be blurry.
        
       | [deleted]
        
       | [deleted]
        
       ___________________________________________________________________
       (page generated 2023-02-03 23:02 UTC)