[HN Gopher] Ask HN: What were the papers on the list Ilya Sutske...
___________________________________________________________________
Ask HN: What were the papers on the list Ilya Sutskever gave John
Carmack?
John Carmack's new interview on AI/AGI [1] carries a puzzle: "So I
asked Ilya Sutskever, OpenAI's chief scientist, for a reading list.
He gave me a list of like 40 research papers and said, 'If you
really learn all of these, you'll know 90% of what matters today.'
And I did. I plowed through all those things and it all started
sorting out in my head." What papers do you think were on this
list? [1] https://dallasinnovates.com/exclusive-qa-john-carmacks-
different-path-to-artificial-general-intelligence/
Author : alan-stark
Score : 338 points
Date : 2023-02-03 14:24 UTC (8 hours ago)
| albertzeyer wrote:
| (Partly copied from
| https://news.ycombinator.com/item?id=34640251.)
|
| On models: Obviously, almost everything is a Transformer
| nowadays (the "Attention Is All You Need" paper). However, I
| think that to get into the field and get a good overview, you
| should also look a bit beyond the Transformer. E.g. RNNs/LSTMs
| are still a must-learn, even though Transformers might be better
| at many tasks. And then all those memory-augmented models, e.g.
| the Neural Turing Machine and follow-ups, are important too.
|
| It also helps to know different architectures, such as just
| language models (GPT), attention-based encoder-decoder (e.g.
| original Transformer), but then also CTC, hybrid HMM-NN,
| transducers (RNN-T).
|
| Some self-promotion: I think my PhD thesis does a good job of
| giving an overview of this: https://www-i6.informatik.rwth-
| aachen.de/publications/downlo...
|
| Diffusion models are another recent, different kind of model.
|
| Then, a separate topic is the training aspect. Most papers do
| supervised training, using a cross-entropy loss against the
| ground-truth target. However, there are many others:
|
| There is CLIP to combine text and image modalities.
|
| There is the whole field on unsupervised or self-supervised
| training methods. Language model training (next label prediction)
| is one example, but there are others.
|
| And then there is the big field on reinforcement learning, which
| is probably also quite relevant for AGI.
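The supervised objective mentioned above and language-model pretraining share the same loss: cross entropy against a target token. A minimal numpy sketch with hypothetical toy numbers (not taken from any particular paper):

```python
# Cross-entropy loss for one prediction: -log softmax(logits)[target].
# For next-token language modelling the "ground truth" is simply the
# next token, so the same loss drives self-supervised pretraining too.
import numpy as np

def cross_entropy(logits, target):
    # log-sum-exp computed stably, then subtract the target's logit
    m = logits.max()
    log_z = m + np.log(np.exp(logits - m).sum())
    return log_z - logits[target]

logits = np.array([2.0, 0.5, -1.0])     # toy scores over a 3-token vocab
print(cross_entropy(logits, 0))         # small: the model favours token 0
print(cross_entropy(logits, 2))         # large: token 2 was seen as unlikely
```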
| [deleted]
| hardware2win wrote:
| I do wonder whether the people behind the Attention Is All You
| Need paper
|
| Will receive a Turing Award
|
| It is cited often
| mirekrusin wrote:
| The guy who said "I don't understand all of this, can we just
| throw more machines at it?" should get the award.
| albertzeyer wrote:
| The authors did not really expect it to have such a huge
| influence. You could also argue it is a somewhat natural
| next step. This paper invented neither self-attention nor
| attention. Attention was already very popular, specifically
| for machine translation, and a few other papers had already
| used self-attention at that point in time. It was just the
| first paper which used attention and self-attention alone,
| and nothing else.
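For anyone new to the paper: the operation the whole architecture is built on is scaled dot-product attention. A minimal single-head sketch with Q = K = V and no learned projections, masking, or multi-head machinery, so a toy illustration rather than the full layer:

```python
# Scaled dot-product self-attention over a toy sequence.
import numpy as np

def self_attention(x):
    # x: (seq_len, d) token vectors; here queries = keys = values = x
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)           # pairwise similarities, scaled
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)      # softmax over the keys
    return w @ x                            # mix values by attention weight

x = np.random.default_rng(0).standard_normal((4, 8))
out = self_attention(x)
print(out.shape)  # (4, 8): one contextualised vector per input position
```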
| RC_ITR wrote:
| >Will receive Turing Award
|
| This is the weird thing - hopefully not! Hopefully there are
| even better NN models coming out every 5-10 years, and we look
| back on transformers as 'just a phase', sort of like how we
| look back at RNNs (which were no less of an amazing
| achievement - look at the proliferation of voice assistants)
| as potentially obsolete technology today.
|
| For example, attention is great and does a really good job
| of simulating context in language, but what if we come up
| with a clever way to simulate symbology? Then we actually are
| back on the path to AGI and transformers will look like
| child's play.
| Beldin wrote:
| > _symbology_
|
| Off-topic, but now I have Willem Dafoe going "What's the
| 'symbology' here? The _symbolism_ ... " in my head (from
| Boondock Saints).
| Gee101 wrote:
| Even though I watched that movie 20 years ago, I will never
| forget that scene.
| mattcaldwell wrote:
| Came here expecting a Haiku.
| qwertyforce wrote:
| Neural nets advance,
|
| Attention is all you need,
|
| Computing ascends.
|
| #by chatgpt
| maxbond wrote:
| The authors who wrote
|
| "Attention is all you need" -
|
| Turing candidates?
| fastball wrote:
| The people behind
|
| "Attention is all you need"
|
| Are often cited
| andrelaszlo wrote:
| Attention.
|
| Attention.
|
| Attention.
|
| - Ikkyu
| modeless wrote:
| The Adam optimizer is another possibility. It's unbelievably
| good and everyone uses it.
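For context, the update itself is only a few lines: momentum on the gradient plus a per-parameter scale from the squared gradient, with bias correction. A sketch using the paper's default beta values (toy quadratic objective; the learning rate here is larger than the paper's default, purely to converge quickly):

```python
# One Adam step (Kingma & Ba, 2014), applied to minimising f(x) = x^2.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad        # first moment: running mean of grads
    v = b2 * v + (1 - b2) * grad ** 2   # second moment: running mean of grad^2
    m_hat = m / (1 - b1 ** t)           # bias correction (t starts at 1)
    v_hat = v / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)  # grad of x^2 is 2x
print(theta)  # near 0, the minimiser of x^2
```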
| seydor wrote:
| I remember an interview with one of the founders of OpenAI
| saying that if it wasn't the transformer architecture, it
| would be something else. What really matters is the scale of
| the model. The transformer is only one of the possible
| configurations that work well with text. It seems they stuck
| with it because it is really so good, so why break things?
| alan-stark wrote:
| Thanks for sharing. Cool to see someone from Aachen NLP group.
| I'll be visiting Aachen/Dusseldorf/Heidelberg area in spring.
| Do you know of any local ML meetups open to general (ML
| engineer/programmer) public?
| albertzeyer wrote:
| Unfortunately, not really. We used to have some RWTH-internal
| meetups, although those were somewhat interrupted by Corona
| and have not really recovered since.
|
| Aachen has quite a few companies with activity on NLP or
| speech recognition, mostly due to my professor Hermann Ney.
| E.g. there is Apple, Amazon, Nuance, eBay. And lesser-known
| AppTek. And in Cologne, you have DeepL. In all those
| companies, you find many people from our group. And then, at
| the RWTH Aachen University, you have our NLP/speech group,
| and also the computer vision group.
| hexhowells wrote:
| While not all papers, this list contains a lot of important
| papers, writings, and conversations currently in AI:
| https://docs.google.com/document/d/1bEQM1W-1fzSVWNbS4ne5PopB...
| querez wrote:
| A lot of other posts here are biased toward recent papers and
| papers that had "a big impact", but miss a lot of foundations.
| I think this reddit post on the most foundational ML papers
| gives a much more balanced overview:
| https://www.reddit.com/r/MachineLearning/comments/zetvmd/d_i...
| cloudking wrote:
| Ilya's publications may be on the list
| https://scholar.google.com/citations?user=x04W_mMAAAAJ&hl=en
| mgaunard wrote:
| In my experience, all deep learning is overhyped, and most
| needs that are not already addressable by linear regression
| can be met with simple supervised learning.
| optimalsolver wrote:
| Carmack says he's pursuing a different path to AGI, then goes
| straight to the guy at the center of the most saturated area of
| machine learning (deep learning)?
|
| I would've hoped he'd be exploring weirder alternatives off the
| beaten path. I mean, neural networks might not even be necessary
| for AGI, but no one at OpenAI is going to tell Carmack that.
| albertzeyer wrote:
| It is possible to use neural networks and still be on a quite
| different path than the mainstream.
|
| Of course, there is a group of people defending symbolic
| computation, e.g. see Gary Marcus, always pushing back on
| connectionism (neural networks).
|
| But this is somewhat of a spectrum, or rather sloppy
| terminology. Once you move away from symbolic computation,
| many things can be interpreted as neural networks. And there
| is also all of computational neuroscience, which also works
| with variants of neural networks.
|
| And there is the human brain, which demonstrates that a neural
| network is capable of AGI. So why would you not want a neural
| network? But that does not mean you cannot still do many
| things very differently from the mainstream.
| throwaway4837 wrote:
| Did you read the full article? In science, you should usually
| have a very solid understanding of what the top minds in the
| field are fixated on as it allows you to try something
| different with confidence, and prevents you from pulling a
| Ramanujan, reinventing the exact same wheel. I can't think of a
| single scientist who caused a paradigm shift and didn't have an
| intimate understanding of the current status quo.
| ly3xqhl8g9 wrote:
| The most off the beaten path to AGI I heard through the
| grapevine is to not have artificial neural networks, as in
| algorithms involving matmul running on silicon, at all. But
| instead, going on the path of the laziest engineer is the best
| engineer, to rely on the fact that neurons, actual neurons from
| someone's brain, already "know" how to make efficient, good-
| enough, general learning architectures and therefore in order
| to obtain programmatic human-like intelligence one would
| 'simply'+ have to implant them not in mice [1] but in an actual
| vat and 'simply' interface with the whatever a group of neurons
| can be called, a soma(?). Given this Brain-on-a-Chip
| architecture, we wouldn't have to stick GPUs in our cars to
| achieve self-driving, but even more wetware (and of course,
| ignore the occasional screams of dread as the wetware becomes
| aware of themselves and how condemned they are to an existence
| of left-right-accelerate-break).
|
| It would have been interesting to see someone like Carmack
| going in this direction, but from the few details he gave, he
| seems less interested in cells and Kjeldahl flasks and more in
| the same type-a-type-a on the ol' QWERTY.
|
| + 'simply' might involve multiple decades of research and
| Buffett knows how many billions
|
| [1] Human neurons implanted in mice influence behavior,
| https://www.nature.com/articles/s41586-022-05277-w
| mindcrime wrote:
| Wouldn't it be fair to say that one has to know what the
| current path _is_ and have some idea where it leads and what
| its issues are, before forging a new path?
|
| I mean, any idiot can go off-trail and start blundering around
| in the weeds, and ultimately wind up tripping, falling, hitting
| their head on a rock, and drowning to death in a ditch. But
| actually finding a new, better, more efficient path probably
| involves at least _some_ understanding of the status quo.
| someweirdperson wrote:
| To walk a path, no knowledge of the existing ones is needed.
| But to be able to claim it is new, it is; even more so to be
| able to claim that the new one is better.
| fnordpiglet wrote:
| Bias and ignorance are two different things. No knowledge
| is ignorance; bias is using knowledge to judge new knowledge.
| The goal isn't to pursue things with raging ignorance, but to
| pursue them with no bias, collecting knowledge without
| conclusion. Then, once you're knowledgeable of what is there,
| you can take off with raging ignorance in the direction no one
| has gone before. But you can't do that while holding bias, any
| more than you can while ignorant of what directions have been
| gone before.
| agar wrote:
| > probably involves at least some understanding of the status
| quo.
|
| Oh man, you had me going with such a vivid metaphor. I was
| really hoping for a payoff in the end, but you abandoned it.
| The easy close would be "probably involves at least _some_
| understanding of the existing terrain " but I was optimistic
| for something less prosaic.
| mindcrime wrote:
| Sorry to disappoint. My creative juices aren't flowing
| today I guess. Need more coffee, or something!
| pavon wrote:
| What a waste it would be to think you are pursuing a different
| path only to discover you spent a year reinventing something
| that you could have learned by reading papers for a few days.
| GuB-42 wrote:
| If you want to be off the beaten path, you have to know where
| the beaten path is.
|
| Otherwise you may end up walking the ditch beside the beaten
| path. It is slow and difficult, but it won't get you anywhere
| new.
|
| For example, you may try an approach that doesn't look like
| deep learning, but after a lot of work, realize that you
| actually reinvented deep learning, poorly. We call these things
| neurons, transformers, backpropagation, etc... but in the end,
| it is just maths. If you end up finding that your "alternative"
| ends up being very well suited to linear algebra and gradient
| descent, once you have found the right formulas, you may
| realize that they are equivalent to the ones used in
| traditional "deep learning" algorithms. It helps to recognize
| this early and take advantage of all the work done before you.
| ramraj07 wrote:
| This is pretty much the same deal in biology as well. At
| Calico, at Verily, at CZI, even at Allen, same story: they say
| they will reinvent biology research, then go hire the same
| narrow-minded professors and CEOs who run the status quo, and
| end up as one more of the same.
|
| Neuralink is the only place where this pattern seemed to break
| a bit, but then Elon seems to have gone off on his own path,
| pushing for faster results and breaking basic ethics.
| chrgy wrote:
| From ChatGPT. Personally I think this list is a bit old, but
| it should cover at least the 60% mark.
|
| Deep Learning: AlexNet (2012), VGGNet (2014), ResNet (2015),
| GoogLeNet (2015), Transformer (2017)
|
| Reinforcement Learning: Q-Learning (Watkins & Dayan, 1992),
| SARSA (R. S. Sutton & Barto, 1998), DQN (Mnih et al., 2013),
| A3C (Mnih et al., 2016), PPO (Schulman et al., 2017)
|
| Natural Language Processing: Word2Vec (Mikolov et al., 2013),
| GLUE (Wang et al., 2018), ELMo (Peters et al., 2018), GPT
| (Radford et al., 2018), BERT (Devlin et al., 2019)
| throwaway4837 wrote:
| Wow, crazy coincidence that you all read this article yesterday
| too. I was thinking of emailing one of them for the list, then I
| fell asleep. Cold emails to scientists generally have a higher
| success-rate than average in my experience.
| theusus wrote:
| As if papers were that comprehensible.
| databroker wrote:
| [dead]
| username3 wrote:
| They asked on Twitter and he didn't reply. We need someone with a
| blue check mark to ask.
| https://twitter.com/ifree0/status/1620855608839897094
| mirekrusin wrote:
| Ask Elon to ask him.
| touringa wrote:
| https://lifearchitect.ai/papers/
| layer8 wrote:
| [flagged]
| siekmanj wrote:
| "RL: A Deep Reinforcement Learning Framework" seems to have
| been hallucinated, does not exist.
| homarp wrote:
| https://arxiv.org/abs/1611.02779 is the closest - RL2: Fast
| Reinforcement Learning via Slow Reinforcement Learning
| nathias wrote:
| I got:
|
| Some of the highly influential papers in the field of AI that
| could have been on the list include "Generative Adversarial
| Networks" by Ian Goodfellow et al., "Attention is All You Need"
| by Vaswani et al., "AlexNet: ImageNet Classification with Deep
| Convolutional Neural Networks" by Alex Krizhevsky et al.,
| "Playing Atari with Deep Reinforcement Learning" by Volodymyr
| Mnih et al., "Human-level control through deep reinforcement
| learning" by Volodymyr Mnih et al., "A Few Useful Things to
| Know About Machine Learning" by Pedro Domingos, among many
| others.
| caxco93 wrote:
| This comment feels very ChatGPTy
| Phil_Latio wrote:
| Not in the list: https://arxiv.org/pdf/1805.09001.pdf
| dang wrote:
| Recent and related:
|
| _John Carmack's 'Different Path' to Artificial General
| Intelligence_ - https://news.ycombinator.com/item?id=34637650 -
| Feb 2023 (402 comments)
| mritchie712 wrote:
| [flagged]
| KRAKRISMOTT wrote:
| Start tweeting at him until he shares
| fnordpiglet wrote:
| Clearly do this by tweet storming him via LLM
| steveBK123 wrote:
| As an AI LLM, I cannot decide which academic papers are
| "best" as the idea of "best" is subjective and there are many
| different factors that need to be considered.
| cwillu wrote:
| I apologize for the oversight, you are correct. Let me know
| if there's anything else I can help you with.
| sho_hn wrote:
| > "You'll find people who can wax rhapsodic about the singularity
| and how everything is going to change with AGI. But if I just
| look at it and say, if 10 years from now, we have 'universal
| remote employees' that are artificial general intelligences, run
| on clouds, and people can just dial up and say, 'I want five
| Franks today and 10 Amys, and we're going to deploy them on these
| jobs,' and you could just spin up like you can cloud-access
| computing resources, if you could cloud-access essentially
| artificial human resources for things like that--that's the most
| prosaic, mundane, most banal use of something like this."
|
| So, slavery?
| aj7 wrote:
| Computer time is paid for.
| sho_hn wrote:
| Will the AIs own the computers?
| i_s wrote:
| Sounds like 'Age of Em' by Robin Hanson: https://ageofem.com/
| hosolmaz wrote:
| Related: https://qntm.org/mmacevedo
| sho_hn wrote:
| I was quoting "Measure of a Man" :-)
|
| "Lena" is a bit of a different case because it's not AGI.
| It's probably ripe for the "forced prison labor" suggested by
| your sibling as the moral cop-out. Imagine being sentenced to
| being a cloud VM image!
| EamonnMR wrote:
| Is there a good way to distinguish between the brain dumps
| in Lena and what you'd call an AGI?
| sho_hn wrote:
| A brain dump has a history, and we ascribe meaning to the
| past. As mentioned, this thread has brought up forced prison
| labor as a form of socially acceptable slavery, and society
| could convince itself that a given brain dump deserves its
| fate, even that it is a form of atonement.
|
| Artificial life, on the other hand, is presumably "pure at
| birth".
|
| Of course it's not that easy. You could discuss whether
| individual instances have unique sets of human rights,
| and value potential futures over pasts.
| klabb3 wrote:
| I think there's broad consensus that slavery only applies to
| human labor. Even within that spectrum people avoid the term
| (see forced prison labor). We also don't use it for animal
| labor, for instance.
| mike_d wrote:
| If we ever conjure a way to capture the human consciousness
| and preserve it before death, "AI" will be based on
| indentured servitude.
|
| The people given a second chance at life will be the ones who
| are quickest at identifying traffic signals or fire hydrants
| from a line up of images.
| sho_hn wrote:
| > animal labor
|
| The context uses human-like/human-level a lot, but I agree
| what level and type of intelligence commands human respect is
| tricky business.
|
| > forced prison labor
|
| Would be interesting if we found ways to convince ourselves
| the AIs had it coming.
|
| Generally speaking, slavery has been morally acceptable and
| popular before, and I will also not be surprised if we return
| to those ways.
| bathtub365 wrote:
| Human slaves were often considered to be less than human or,
| at the very least, not deserving of basic rights that other
| humans enjoyed, as part of the moral and ethical frameworks
| that supported the practice. I think we might see the same
| shift in dominant ideology if we do have "true" AGI. I'm sure
| I could be convinced that an intelligence that develops and
| grows over a number of years begins to have a right to exist
| and a right to freedom of expression and movement.
| sho_hn wrote:
| Given the outcry/backlash over Dall-E/ChatGPT (what is
| "real art", etc.) and how much of our society is permeated
| by a search for authenticity (perceived) already, I wonder
| if you're right. We might decide "artificial" lifeforms are
| a lower class than "evolved in nature". For many religions
| this could be a natural take - made by God vs. folly of
| man, etc.
| zomglings wrote:
| [flagged]
| winrid wrote:
| The description in the book is that the cat was such a pain
| it was "not adding value" to his life.
|
| That's a bit more detail than just peeing on the sofa once.
| Waterluvian wrote:
| The story is apparently a bit more complex.
|
| The cat was having a lot of behavioural issues and ultimately
| he surrendered it to a shelter, where it may have been
| euthanized if nobody adopted it. (note that nothing I'm saying
| here is meant to condone or condemn the action)
|
| The author editorialized it to fit the desired narrative, which
| is a thing that happens quite a lot. Gotta sell them books!
| ncann wrote:
| There's a big difference between "had his cat put down" and
| "surrendered it to a shelter". If it's indeed the latter (I
| know nothing about the story), then there's nothing to talk
| about.
| tayo42 wrote:
| Try to keep reminding yourself that you only respect them for
| a tiny bit of specific knowledge. You don't need to like the
| person they are.
|
| Easier said than done, for sure. I have a couple of hobbies
| where the top-tier people are just annoying in the rest of
| their lives. Being really good at one thing often seems to
| correlate with other insane personality traits.
| unixhero wrote:
| I don't care. The cat was probably put down painlessly at a
| vet. I don't see the issue at all. Let's not do these
| cancelling attempts.
| icepat wrote:
| Having an animal killed for no good reason, other than it
| caused you problems, is a bit twisted. In situations like
| this, rehoming them is the ethical thing to do.
|
| This is literally saying 'I don't care if he's killing
| kittens'.
| unixhero wrote:
| Ask any vet if this happens all the time.
| icepat wrote:
| Yeah, but that does not mean it's not wildly unethical.
| xnickb wrote:
| Still completely irrelevant for the discussion
| hungryforcodes wrote:
| I enjoyed my meatballs today, btw! I'm joking, in the sense
| that I didn't eat meatballs today, but we kill animals all
| the time. That's what humans do. Do you eat meat or foods
| cooked with animal fats?
|
| I think you get my point.
|
| A life is a life. I don't see why pets are somehow more
| important than other animals.
| headhasthoughts wrote:
| In your opinion, was O.J. Simpson "cancelled?" Hans Reiser?
|
| At what point do we draw a line and accept that a person can
| be judged negatively for harm they cause?
| ryanSrich wrote:
| Why not put it up for adoption? Having it intentionally
| killed is psychotic.
| LeonenTheDK wrote:
| Having a living thing killed because it did something you
| don't like, something it doesn't understand is wrong, is so
| incredibly heartless. Pets aren't toys to be tossed when you
| get tired of them; they're living creatures deserving of a
| good life.
|
| Now if it was peeing everywhere because of a terminal medical
| issue causing it pain, that's a bit of a different story. I
| don't know the situation he and the cat were in. But if it is
| just as the parent comment says, why kill the thing instead
| of giving it up for a chance at a better life?
| winwhiz wrote:
| I had read that somewhere else and this is as far as I got
|
| https://twitter.com/id_aa_carmack/status/1241219019681792010
| klaussilveira wrote:
| Following:
| https://twitter.com/u3dcommunity/status/1621524851898089478?...
| sebkomianos wrote:
| Following: https://news.ycombinator.com/item?id=34643510
| Liberonostrud wrote:
| [flagged]
| [deleted]
| polskibus wrote:
| What about just asking Carmack on twitter?
| arbuge wrote:
| Or, more directly, ask Sutskever...
| jranieri wrote:
| I did, without success.
| belter wrote:
| I asked him too.
|
| He said:
|
| - Who are you, and how did you get into my house?
| wincy wrote:
| I wouldn't advise this after seeing what Carmack did to
| that guy he got in a headlock. [0] "That was the tap part",
| makes me laugh every time.
|
| [0] https://m.youtube.com/watch?v=X68Mm_kYRjc
| TigeriusKirk wrote:
| Is anyone asking Ilya Sutskever?
| EvgeniyZh wrote:
| Attention, scaling laws, diffusion, vision transformers,
| Bert/Roberta, CLIP, chinchilla, chatgpt-related papers, nerf,
| flamingo, RETRO/some retrieval sota
| [deleted]
| seydor wrote:
| what do you mean 'scaling laws'?
| EvgeniyZh wrote:
| J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess,
| R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei. Scaling
| laws for neural language models. arXiv preprint
| arXiv:2001.08361, 2020.
|
| and multiple follow-ups
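The headline relationship in that paper is a power law: test loss falls predictably as non-embedding parameter count N grows, roughly L(N) = (N_c / N)^alpha_N. A sketch plugging in the paper's reported fitted constants (approximate values, and only meaningful within the regime they measured):

```python
# Power-law scaling of language-model loss with parameter count,
# using the fitted constants reported by Kaplan et al. (2020):
# N_c ~ 8.8e13 non-embedding parameters, alpha_N ~ 0.076.
def loss_vs_params(n, n_c=8.8e13, alpha=0.076):
    return (n_c / n) ** alpha

for n in (1e6, 1e9, 1e12):
    print(f"N = {n:.0e}  ->  L ~ {loss_vs_params(n):.2f}")
# Loss keeps shrinking, but slowly: each 1000x in parameters
# multiplies the loss by the same constant factor (~0.59 here).
```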
| jimmySixDOF wrote:
| >90% of what matters today
|
| Strikes me as the kind of thing where that last 10% will need 400
| papers
| mindcrime wrote:
| _" The first 90% is easy. It's the second 90% that kills ya."_
| kabdib wrote:
| "All projects are divided into three phases, each consisting
| of 90% of the work."
|
| -- _just about everything I've shipped :-)_
| michpoch wrote:
| For the last 10% you'll need to write a paper yourself.
| swyx wrote:
| maybe that's the part where he intends to deviate. he just
| doesn't need to reinvent the settled science.
| tikhonj wrote:
| Along with the kind of details and tacit knowledge that never
| makes it into papers...
| sillysaurusx wrote:
| "The email including them got lost to Meta's two-year auto-delete
| policy by the time I went back to look for it last year. I have a
| binder with a lot of them printed out, but not all of them."
|
| RIP. If it's any consolation, it sounds like the list is at least
| three years old by now. Which is a long time considering that
| 2016 is generally regarded as the date of the deep learning
| revolution.
| pengaru wrote:
| > If it's any consolation, it sounds like the list is at least
| three years old by now.
|
| In my experience when it comes to learning technical subjects
| from a position of relative total ignorance, it's the older
| resources that are the easiest to bootstrap knowledge from.
| Then you basically work your way forward through the newer
| texts, like an accelerated replay of a domain's progress.
|
| I think it's kind of obvious that this would be the case when
| you think about it. Just like how history textbooks can't keep
| growing in size to give all past events an equal treatment, nor
| can technical references as a domain matures.
|
| You're forced to toss out stuff deemed least relevant to today,
| and in technical domains that's often stuff you've just started
| assuming as understood by the reader... where early editions of
| a new space would have prioritized getting the reader up to
| speed in something totally novel to the world.
| moglito wrote:
| "considering that 2016 is generally regarded as the date of the
| deep learning revolution" --
|
| I thought it was 2012, when AlexNet took the imagenet crown?
| sillysaurusx wrote:
| That's probably fair. But you'd be hard-pressed to find a DL
| stack to try out your ideas with prior to 2016, since that's
| when Tensorflow launched. :)
|
| (Gosh, it's been less than a decade. Time sometimes doesn't
| fly, considering how much it's changed the world since
| then...)
| abrichr wrote:
| Theano was first released in 2007.
| sillysaurusx wrote:
| That's actually fascinating. Were there many experiments
| done in it back in the 00's?
|
| I'm just trying to imagine the things you could do with
| it back then. 2007 had relatively fast gpus for the time,
| but certainly nothing compared to today. Yet it'd
| certainly be enough for MNIST training, which makes me
| wonder what else could be done.
| abrichr wrote:
| You can look at Yoshua Bengio's Google Scholar profile
| [1] and scroll down to see what they were working on
| around that time.
|
| Here are some papers with many citations:
|
| - An empirical evaluation of deep architectures on
| problems with many factors of variation [2]
|
| - Extracting and composing robust features with denoising
| autoencoders [3]
|
| - Scaling learning algorithms towards AI [4]
|
| [1] https://scholar.google.com/citations?hl=en&user=kukA0
| LcAAAAJ...
|
| [2] https://scholar.google.com/citations?view_op=view_cit
| ation&h...
|
| [3] https://scholar.google.com/citations?view_op=view_cit
| ation&h...
|
| [4] https://scholar.google.com/citations?view_op=view_cit
| ation&h...
| ladberg wrote:
| FWIW, in 2016 I was on an ML team at Apple that had been
| shipping production neural networks on-device for a while
| already. At the time everyone used an assortment of random
| tools (Theano, Torch, Caffe). I worked on an internal
| tool that originally started as a Theano fork but was
| closer to a modern-day TensorFlow XLA (and has since been
| axed in favor of TensorFlow for most teams).
| ilaksh wrote:
| My guess is that multimodal transformers will probably eventually
| get us most of the way there for general purpose AI.
|
| But AGI is one of those very ambiguous terms. For many people
| it's either an exact digital replica of human behavior that is
| alive, or something like a God. I think it should also apply to
| general purpose AI that can do most human tasks in a strictly
| guided way, although not have other characteristics of humans or
| animals. For that I think it can be built on advanced multimodal
| transformer-based architectures.
|
| For the other stuff, it's worth giving a passing glance to the
| fairly extensive amount of research that has been labeled AGI
| over the last decade or so. It's not really mainstream except
| maybe the last couple of years because really forward looking
| people tend to be marginalized including in academia.
|
| https://agi-conf.org
|
| Looking forward, my expectation is that things like memristors or
| other compute-in-memory will become very popular within say 2-5
| years (obviously total speculation since there are no products
| yet that I know of) and they will be vastly more efficient and
| powerful especially for AI. And there will be algorithms for
| general purpose AI possibly inspired by transformers or AGI
| research but tailored to the new particular compute-in-memory
| systems.
| TimPC wrote:
| Why do you think multimodal transformers will get us anywhere
| near general purpose AI? Multimodal transformers are basically
| a technology for sequence-to-sequence intelligent mappings and
| it seems to me extremely unlikely that general intelligence is
| one or more specific sequence-to-sequence mappings. Many
| specific purpose problems are sequence-to-sequence but these
| tend to be specialized functionalities operating in one or more
| specific domains.
| RC_ITR wrote:
| A lot of people don't really _get_ that our brains are a
| bunch of specialized subcomponents that work in concert (your
| prefrontal cortex just cannot beat your heart, no matter how
| optimized it gets). This is unsurprising, as our brains are
| one of the most complex/hard-to-monitor things on earth.
|
| When an artificial tool that is really a _point solution_
| "tricks" us into thinking it has replicated a task that
| requires complex multi-component functioning within our
| brain, we assume the tool is acting like our brain is acting.
|
| The joke of course being that if you maliciously edited
| GPT's index for translating vectors to words, it would
| produce gibberish, and we wouldn't be impressed (despite it
| being the exact same core model).
|
| We are only impressed by the complex sequence-to-sequence
| strings it makes because the tokens happen to be words
| (arguably the most important things in our lives).
|
| EDIT: a great historic metaphor for this is how we thought
| about 'computer vision' and CNNs. They do great at
| identifying things in images, but notice that we still use
| image-based captchas (even on OpenAI sites, no less)?
|
| That's because it turns out optical illusions and
| context-heavy images are things that CNNs really struggle
| with (since the problem space is bigger than 'how are these
| pixels arranged').
| ilaksh wrote:
| A couple of things.
|
| 1) As I said, many people have different ideas of what we are
| talking about. I assume that for you general purpose AI has
| more capabilities, such as the ability to quickly learn tasks
| to a high level on the fly. For me, it still qualifies as
| general purpose if it can do most tasks but relies on a lot
| of pre-training and, let's say, knowledge-base lookup.
|
| 2) It seems obvious to me that ChatGPT demonstrates the
| general-purpose utility of these types of LLMs, and it is
| easy to speculate that something similar, but with visual
| input/output, will be even more general. And so we are just
| looking at a matter of degree by that definition.
| mirekrusin wrote:
| AGI will be AI which can improve its own code after N
| iterations, where N will be blurry.
| [deleted]
| [deleted]
___________________________________________________________________
(page generated 2023-02-03 23:02 UTC)