[HN Gopher] Richard Sutton and Andrew Barto Win 2024 Turing Award
       ___________________________________________________________________
        
       Richard Sutton and Andrew Barto Win 2024 Turing Award
        
       Author : camlinke
       Score  : 502 points
       Date   : 2025-03-05 10:03 UTC (1 day ago)
        
 (HTM) web link (awards.acm.org)
 (TXT) w3m dump (awards.acm.org)
        
       | rvz wrote:
       | Absolutely well deserved.
        
         | darosati wrote:
         | Hear hear
        
       | ofirpress wrote:
       | Good time to re-read The Bitter Lesson:
       | https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson...
        
         | khaledh wrote:
         | Indeed a bitter lesson. I once enjoyed encoding human knowledge
         | into a computer because it gives me understanding of what's
         | going on. Now everything is becoming a big black box that is
         | hard to reason about. /sigh/
         | 
         | Also, Moore's law has become a self-fulfilling prophecy. Now
         | more than ever, AI is putting a lot of demand on computational
          | power, to the point where it drives chip makers to create
         | specialized hardware for it. It's becoming a flywheel.
        
           | anonzzzies wrote:
           | I am still hoping AI progress will get to the point where the
           | AI can eventually create AI's that are built up out of robust
           | and provable logic which can be read and audited. Until that
           | time, I wouldn't trust it for risky stuff. Unfortunately,
           | it's not my choice and within a scarily short timespan, black
           | boxes will make painfully wrong decisions about vital things
           | that will ruin lives.
        
             | tromp wrote:
             | AI assisted theorem provers will go a bit in that
             | direction. You may not know exactly how they managed to
             | construct a proof, but you can examine that proof in detail
             | and verify its correctness.
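              | 
              | A tiny illustration in Lean 4 (assuming just its core
              | library): however the proof term below was found, by a
              | human or by a model, the kernel re-checks it against the
              | statement, so the trust sits in the checker rather than
              | in the search:
              | 
              |   -- acceptance does not depend on where the proof came from;
              |   -- the kernel verifies it against the stated theorem
              |   theorem add_comm_example (a b : Nat) : a + b = b + a :=
              |     Nat.add_comm a b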
        
               | anonzzzies wrote:
                | Yes, I have a small team (me being 1/3 of it) doing formal
                | verification in my company and we do this; it doesn't
                | actually matter how the AI got there, because we can
                | mathematically say it's correct, which is what matters. We
                | do (and did) program synthesis and proofs, but this is all
                | very far from doing anything serious at scale.
        
               | InkCanon wrote:
               | What kind of company needs formal verification? Real time
               | systems?
        
               | anonzzzies wrote:
               | Real time / embedded / etc for money handling,
               | healthcare, aviation/transport... And 'needs' is a loaded
               | term; the biggest $ contributors to formal verification
               | progress are blockchain companies these days while a lot
               | of critical systems are badly written, outsourced things
               | that barely have tests.
               | 
               | My worst fear, which is happening because it works-ish,
                | is vague/fuzzy systems _being_ the software because it's
               | so like humans and we don't have anything else. It's a
               | terrible idea, but of course we are in a hurry.
        
               | tasty_freeze wrote:
               | Companies designing digital circuits use it all the time.
               | 
               | Say you have a module written in VHDL or Verilog and it
               | is passing regressions and everyone is happy. But as the
               | author, you know the code is kind of a mess and you want
               | to refactor the logic. Yes, you can make your edits and
               | then run a few thousand directed tests and random
               | regressions and hope that any error you might have made
               | will be detected. Or you can use formal verification and
               | prove that the two versions of your source code are
               | functionally identical. And the kicker is it often takes
               | minutes to formally prove it, vs hundreds to thousands of
               | CPU hours to run a regression suite.
               | 
               | At some point the source code is mapped from a RTL
               | language to gates, and later those gates get mapped to a
               | mask set. The software to do that is complex and can have
               | bugs. The fix is to extract the netlist from the masks
               | and then formally verify that the extracted netlist
               | matches the original RTL source code.
               | 
               | If your code has assertions (and it should), formal
               | verification can be used to find counter examples that
               | disprove the assertion.
               | 
               | But there are limitations. Often logic is too complex and
               | the proof is bounded: it can show that from some initial
               | state no counter example can be found in, say, 18 cycles,
               | but there might be a bug that takes at least 20 cycles to
               | expose. Or it might find counter examples and you find it
               | arises only in illegal situations, so you have to
               | manually add constraints to tell it which input sequences
               | are legal (which often requires modeling the behavior of
               | the module, and that itself can have bugs...).
               | 
               | The formal verifiers that I'm familiar with are really a
               | collection of heuristic algorithms and a driver which
               | tries various approaches for a certain amount of time
               | before switching to a different algorithm to see if that
               | one can crack the nut. Often, when a certain part of the
               | design can be proven equivalent, it aids in making
               | further progress, so it is an iterative thing, not a
               | simple "try each one in turn". The frustrating thing is
               | you can run formal on a module and it will prove there
               | are no violations with a bounded depth of, say, 32
               | cycles. A week later a new release of your formal tool
               | comes out with bug fixes and enhancements. Great! And now
               | that module might have a proof depth of 22 cycles, even
               | though nothing changed in the design.
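                | 
                | To make the bounded-proof idea concrete, here is a toy
                | sketch in Python (made-up mini design, not a real tool;
                | real equivalence checkers use SAT/SMT and induction
                | rather than enumeration). It checks two implementations
                | of a saturating 2-bit counter against every input
                | sequence up to a chosen depth:
                | 
                |   from itertools import product
                | 
                |   def step_v1(state, inc):   # original RTL, as a model
                |       return min(state + 1, 3) if inc else state
                | 
                |   def step_v2(state, inc):   # refactored version
                |       return state + 1 if (inc and state < 3) else state
                | 
                |   def equivalent_up_to(depth):
                |       # bounded check: all input sequences up to this length
                |       for k in range(1, depth + 1):
                |           for inputs in product([0, 1], repeat=k):
                |               s1 = s2 = 0   # reset state
                |               for inc in inputs:
                |                   s1, s2 = step_v1(s1, inc), step_v2(s2, inc)
                |                   if s1 != s2:
                |                       return False, inputs  # counterexample
                |       return True, None
                | 
                |   print(equivalent_up_to(18))  # no mismatch within 18 cycles
                | 
                | A bug that only shows up after 20 cycles would slip past
                | this bound, which is exactly the limitation described
                | above.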
        
             | optimalsolver wrote:
             | >AI can eventually create AI's that are built up out of
             | robust and provable logic
             | 
             | That's the approach behind Max Tegmark and Steven
             | Omohundro's "Provably Safe AGI":
             | 
             | https://arxiv.org/abs/2309.01933
             | 
             | https://www.youtube.com/watch?v=YhMwkk6uOK8
             | 
             | However, there are issues. How do you even begin to
             | formalize concepts like human well-being?
        
               | anonzzzies wrote:
               | > However there are issues. How do you even begin to
               | formalize concepts like human well-being?
               | 
               | Oh agreed! But with AI we might(!) have the luxury to
               | create different types of brains; logically correct
               | brains for space flight, building structures (or at least
                | the calculations), taxes, accounting, physics, math, etc.,
                | and brains with feelings for many other things. Have
               | those cooperate.
               | 
               | ps. thanks for the links!
        
               | necovek wrote:
               | The only problem is that "logical correctness" depends on
                | the limits of the human brain too: formal logic is based on
               | the usual pre-accepted assumptions and definitions
               | ("axioms").
               | 
               | This is what I consider the limit of the human mind: we
               | have to start with a few assumptions we can't "prove" to
               | build even a formal logic system which we then use to
               | build all the other provably correct systems, but we
               | still add other axioms to make them work.
               | 
               | It's hard for me to even think how AI can help with that.
        
             | fuzztester wrote:
             | Quis custodiet ipsos custodes?
             | 
             | https://en.m.wikipedia.org/wiki/Quis_custodiet_ipsos_custod
             | e...
             | 
             | excerpt of the first few paragraphs, sorry about any wrong
             | formatting, links becoming plain text, etc. just pasted it
             | as is:
             | 
             | Quis custodiet ipsos custodes? is a Latin phrase found in
             | the Satires (Satire VI, lines 347-348), a work of the
             | 1st-2nd century Roman poet Juvenal. It may be translated as
             | "Who will guard the guards themselves?" or "Who will watch
             | the watchmen?".
             | 
             | The original context deals with the problem of ensuring
             | marital fidelity, though the phrase is now commonly used
             | more generally to refer to the problem of controlling the
             | actions of persons in positions of power, an issue
             | discussed by Plato in the Republic.[citation needed] It is
             | not clear whether the phrase was written by Juvenal, or
             | whether the passage in which it appears was interpolated
              | into his works.
              | 
              | Original context:
             | 
             | The phrase, as it is normally quoted in Latin, comes from
             | the Satires of Juvenal, the 1st-2nd century Roman satirist.
             | Although in its modern usage the phrase has wide-reaching
             | applications to concepts such as tyrannical governments,
             | uncontrollably oppressive dictatorships, and police or
             | judicial corruption and overreach, in context within
             | Juvenal's poem it refers to the impossibility of enforcing
             | moral behaviour on women when the enforcers (custodes) are
             | corruptible (Satire 6, 346-348):
             | 
             | audio quid ueteres olim moneatis amici, "pone seram,
             | cohibe." sed quis custodiet ipsos custodes? cauta est et ab
             | illis incipit uxor.
             | 
             | I hear always the admonishment of my friends: "Bolt her in,
             | constrain her!" But who will watch the watchmen? The wife
             | plans ahead and begins with them!
        
               | gsf_emergency_2 wrote:
                | Apologies for taking the phrase in a slightly farcical (&
                | incurious?) direction:
                | 
                |   Who will take custody of the custodians?
        
           | amelius wrote:
           | Well, take compiler optimization for example. You can allow
           | your AI to use correctness-preserving transformations only.
           | This will give you correct output no matter how weird the AI
           | behaves.
           | 
           | The downside is that you will sometimes not get the
           | optimizations that you want. But, this is sort of already the
           | case, even with human made optimization algorithms.
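            | 
            | A minimal sketch of that idea in Python (hypothetical toy
            | IR and rules, not any real compiler): the AI may choose
            | where and in what order to rewrite, but it can only pick
            | from a whitelist of equivalence-preserving rules, so the
            | output stays correct however it behaves:
            | 
            |   # tiny expression IR:
            |   # ("const", n), ("var",), ("add", a, b), ("mul", a, b)
            |   def evaluate(e, x):
            |       if e[0] == "const": return e[1]
            |       if e[0] == "var":   return x
            |       a, b = evaluate(e[1], x), evaluate(e[2], x)
            |       return a + b if e[0] == "add" else a * b
            | 
            |   def rewrite(e):
            |       # whitelisted, correctness-preserving rewrites only
            |       if e[0] in ("add", "mul"):
            |           e = (e[0], rewrite(e[1]), rewrite(e[2]))
            |           if e[0] == "add" and e[2] == ("const", 0):
            |               return e[1]                       # x + 0 -> x
            |           if e[0] == "mul" and e[2] == ("const", 1):
            |               return e[1]                       # x * 1 -> x
            |           if e[1][0] == "const" and e[2][0] == "const":
            |               return ("const", evaluate(e, 0))  # fold constants
            |       return e
            | 
            |   expr = ("add", ("mul", ("var",), ("const", 1)), ("const", 0))
            |   opt = rewrite(expr)   # -> ("var",)
            |   assert all(evaluate(expr, x) == evaluate(opt, x)
            |              for x in range(-5, 6))
            | 
            | Here the rewrite order is fixed, but the point stands if a
            | model chooses it instead: missing an optimization is
            | possible, producing a wrong program is not.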
        
         | cxr wrote:
         | Canonical URL:
         | <http://www.incompleteideas.net/IncIdeas/BitterLesson.html>
        
         | kleiba wrote:
         | This depends a little bit on what the goal of AI research is.
         | If it is (and it might well be) to build machines that excel at
         | tasks previously thought to be exclusively reserved to, or
         | needing to involve, the human mind, then these bitter lessons
         | are indeed worthwhile.
         | 
         | But if you do AI research with the idea that by teaching
         | machines how to do X, we might also be able to gain insight in
          | how people do X, then ever more complex statistical setups will
          | be of limited informative value.
         | 
         | Note that I'm not taking either point of view here. I just want
         | to point out that perhaps a more nuanced approach might be
         | called for here.
        
           | visarga wrote:
           | > if you do AI research with the idea that by teaching
           | machines how to do X, we might also be able to gain insight
            | in how people do X, then ever more complex statistical setups
            | will be of limited informative value
           | 
           | At the very least we know consistent language and vision
           | abilities don't require lived experience. That is huge in
           | itself, it was unexpected.
        
             | kleiba wrote:
             | Is that true though given e.g. the hallucinations you
             | regularly get from LLMs?
        
             | probably_wrong wrote:
             | > _At the very least we know consistent language and vision
              | abilities don't require lived experience._
             | 
              | I don't think that's true. A good chunk of the progress
              | made in the last few years is driven by investing thousands
              | of man-hours asking them "Our LLM failed at answering X. How
             | would you answer this question?". So there's definitely
             | some "lived experience by proxy" going on.
        
         | crabbone wrote:
         | I remember the article, and remember how badly it missed the
         | point... The goal of writing a chess program that could beat a
         | world champion wasn't to beat the world champion... the goal
          | was to gain insight into how anyone can play chess well.
          | The victory in that match would've been equivalent to, e.g.,
          | drugging Kasparov prior to the match, or putting a gun to his
         | head and telling him to lose: even cheaper and more effective.
        
           | krallistic wrote:
           | "The goal of Automated driving is not to drive automatically
           | but to understand how anyone can drive well"...
           | 
           | The goal of DeepBlue was to beat the human with a machine,
           | nothing more.
           | 
            | While the pursuit of deeper understanding motivates a lot
            | of research, most AI (read: modern DL) research is not about
            | understanding human intelligence, but about automating things
            | we could not do before. (Understanding human intelligence is
           | nowadays a different field)
        
         | DavidPiper wrote:
         | This describes Go AIs as a brute force strategy with no
         | heuristics, which is false as far as I know. Go AIs don't
         | search the entire sample space, they search based on their
         | training data of previous human games.
        
           | dfan wrote:
           | The paragraph on Go AI looked accurate to me. Go AI research
           | spent decades trying to incorporate human-written rules about
           | tactics and strategy. None of that is used any more, although
           | human knowledge is leveraged a bit in the strongest programs
           | when choosing useful features to feed into the neural nets.
           | (Strong) Go AIs are not trained on human games anymore.
           | Indeed they don't search the entire sample space when they
           | perform MCTS, but I don't see Sutton claiming that they do.
        
           | signa11 wrote:
           | > ... This describes Go AIs as a brute force strategy with no
           | heuristics ...
           | 
           | no, not really, from the paper
           | 
           | >> Also important was the use of learning by self play to
           | learn a value function (as it was in many other games and
           | even in chess, although learning did not play a big role in
           | the 1997 program that first beat a world champion). Learning
           | by self play, and learning in general, is like search in that
           | it enables massive computation to be brought to bear.
           | 
            | the important notion here is, imho, "learning by self play".
            | The required heuristics emerge out of that; they are not
            | _programmed_ in.
        
           | HarHarVeryFunny wrote:
           | First there was AlphaGo, which had learnt from human games,
           | then further improved from self-play, then there was AlphaGo
           | Zero which taught itself from scratch just by self-play, not
           | using any human data at all.
           | 
           | Game programs like AlphaGo and AlphaZero (chess) are all
           | brute force at core - using MCTS (Monte Carlo Tree Search) to
           | project all potential branching game continuations many moves
            | ahead. Where the intelligence/heuristics comes into play is in
           | pruning away unpromising branches from this expanding tree to
           | keep the search space under control; this is done by using a
           | board evaluation function to assess the strength of a given
           | considered board position and assess if it is worth
           | continuing to evaluate that potential line of play.
           | 
           | In DeepBlue (old IBM "chess computer" that beat Kasparov) the
            | board evaluation function was hand-written using human chess
           | expertise. In modern neural-net based engines such as AlphaGo
           | and AlphaZero, the board evaluation function is learnt -
           | either from human games and/or from self-play, learning what
           | positions lead to winning outcomes.
           | 
           | So, not just brute force, but that (MCTS) is still the core
           | of the algorithm.
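            | 
            | A toy sketch of that division of labour in Python (made-up
            | game and heuristic, and plain depth-limited alpha-beta
            | rather than MCTS, but the structure is the same): the
            | search projects continuations ahead and prunes, while an
            | evaluation function, hand-written in DeepBlue, learned in
            | AlphaGo/AlphaZero, scores the frontier:
            | 
            |   # toy game: one pile of stones, take 1-3, taking the last wins
            |   def moves(n): return [m for m in (1, 2, 3) if m <= n]
            | 
            |   def evaluate(n):
            |       # stand-in for the board-evaluation function
            |       return 0.0 if n % 4 == 0 else 0.5
            | 
            |   def search(n, depth, alpha=-1.0, beta=1.0):
            |       # negamax + alpha-beta; score is for the player to move
            |       if n == 0:
            |           return -1.0         # previous player took the last stone
            |       if depth == 0:
            |           return evaluate(n)  # frontier: ask the eval function
            |       best = -1.0
            |       for m in moves(n):
            |           best = max(best, -search(n - m, depth - 1, -beta, -alpha))
            |           alpha = max(alpha, best)
            |           if alpha >= beta:
            |               break           # prune the unpromising branch
            |       return best
            | 
            |   print(search(10, depth=12))  # deep enough: exact value 1.0
            |   print(search(10, depth=2))   # shallow: heuristic guess instead
            | 
            | Swapping the tree search for MCTS and the heuristic for a
            | learned network gives the AlphaZero shape of the same loop.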
        
             | bubblyworld wrote:
              | This is a somewhat uninteresting matter of semantics, but I
             | think brute force generally refers to exhaustive search.
             | MCTS is not brute force for that very reason (the vast
             | majority of branches are never searched at all).
        
               | HarHarVeryFunny wrote:
               | OK, but I think it's generally understood that exhaustive
               | search is not feasible for games like Chess and Go, so
               | when "brute force" is used in this context it means an
               | emphasis on deep search and number of positions evaluated
                | rather than the human approach, where many orders of
                | magnitude fewer positions are evaluated.
        
               | bubblyworld wrote:
               | I think that kind of erodes the meaning of the phrase. A
               | typical MCTS run for alphazero would evaluate what, like
               | 1024 rollouts? Maybe less? That's a drop in the ocean
               | compared to the number of states available in chess. If
               | you call that brute force then basically everything is.
               | 
               | I've personally viewed well over a hundred thousand
               | rollouts in my training as a chess bot =P
        
             | visarga wrote:
             | > Game programs like AlphaGo and AlphaZero (chess) are all
             | brute force at core -
             | 
             | What do you call 2500 years of human game play if not brute
             | force? Cultural evolution took 300K years, quite a lot of
             | resources if you ask me.
        
               | beepbooptheory wrote:
               | Either you missed an /s or I am very interested to hear
               | you unpack this a little bit. If you are serious, it just
               | turns "brute force" into a kind of empty signifier
               | anyway.
               | 
               | What do you call the attraction of bodies if not love?
               | What is an insect if not a little human?
        
               | HarHarVeryFunny wrote:
               | That 2500 years of game play is reflected in chess theory
               | and book openings, what you might consider as pre-
               | training vs test time compute.
               | 
               | A human grandmaster might calculate 20-ply ahead, but
               | only for a very limited number of lines, unlike a
               | computer engine that may evaluate millions of positions
               | for each move.
               | 
               | Pattern matching vs search (brute force) is a trade off
               | in games like Chess and Go, and humans and MCTS-based
               | engines are at opposite ends of the spectrum.
        
         | perks_12 wrote:
         | The Bitter Lesson seems to be generally accepted knowledge in
         | the field. Wouldn't that make DeepSeek R1 even more of a
         | breakthrough?
        
           | currymj wrote:
           | that was "bitter lesson" in action.
           | 
           | for example there are clever ways of rewarding all the steps
           | of a reasoning process to train a network to "think". but
           | deepseek found these don't work as well as much simpler
           | yes/no feedback on examples of reasoning.
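            | 
            | Very roughly, in Python (toy numbers, not DeepSeek's actual
            | pipeline): sample several reasoning traces per prompt and
            | score each one only by whether its final answer checks out;
            | the resulting per-trace advantages are what the policy
            | update then pushes on:
            | 
            |   def verifier(answer, reference):
            |       # single yes/no outcome reward on the final answer
            |       return 1.0 if answer == reference else 0.0
            | 
            |   def group_advantages(traces, reference):
            |       rewards = [verifier(t["answer"], reference) for t in traces]
            |       mean = sum(rewards) / len(rewards)
            |       # a process-reward scheme would score every intermediate
            |       # step; here the whole trace shares one outcome signal
            |       return [r - mean for r in rewards]
            | 
            |   traces = [{"steps": ["..."], "answer": "42"},
            |             {"steps": ["..."], "answer": "41"},
            |             {"steps": ["..."], "answer": "42"}]
            |   print(group_advantages(traces, reference="42"))
            |   # roughly [0.33, -0.67, 0.33]: correct traces get pushed up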
        
         | Buttons840 wrote:
         | Oof. Imagine the bitter lesson classical NLP practitioners
         | learned. That paper is as true today as ever.
        
         | jdright wrote:
         | > In computer vision, there has been a similar pattern. Early
         | methods conceived of vision as searching for edges, or
         | generalized cylinders, or in terms of SIFT features. But today
          | all this is discarded. Modern deep-learning neural networks use
         | only the notions of convolution and certain kinds of
         | invariances, and perform much better.
         | 
          | I was there, at the moment when pattern matching for vision
          | started to die. It was not completely lost, though; learning
          | from that time is still useful in other places today.
        
           | abdullahkhalids wrote:
           | I was an undergrad interning in a computer vision lab in the
           | early 2010s. During group meeting, someone presented a new
           | paper that was using abstract machine learning like stuff to
            | do vision. The prof was visibly perturbed and incredulous. He
           | could not believe that this approach was even a little bit
           | viable, when it so clearly was.
           | 
           | Best lesson for me - vowed never to be the person opposed to
           | new approaches that work.
        
             | kenjackson wrote:
             | > Best lesson for me - vowed never to be the person opposed
             | to new approaches that work.
             | 
             | I think you'll be surprised at how hard that will be to do.
              | The reason many people feel that way is that (a)
              | they've become an expert (often recognized) in the old
              | approach, and (b) they make significant money (or derive
              | some other benefit) from it.
             | 
             | At the end of the day, when a new approach greatly
             | encroaches into your way of life -- you'll likely push
             | back. Just think about the technology that you feel you
             | derive the most benefit from today. And then think if
             | tomorrow someone created something marginally better at its
             | core task, but for which you no longer reap any of the
             | rewards.
        
               | abdullahkhalids wrote:
               | Of course it is difficult, for precisely the reasons you
               | indicate. It's one of those lifetime skills that you have
               | to continuously polish, and if you fall behind it is
               | incredibly hard to recover. But such skills are necessary
               | for being a resilient person.
        
         | blufish wrote:
         | nice read and insightful
        
       | PartiallyTyped wrote:
       | This made my day! Well deserved!
        
       | darkoob12 wrote:
       | They should have given it to some physicists to make it even.
        
       | porridgeraisin wrote:
       | Their book "Introduction to Reinforcement Learning" is one of the
       | most accessible texts in the AI/ML field, highly recommend
       | reading it.
        
         | barrenko wrote:
          | I've tried descending the RL branch, but I always seem way
          | out of my depth with those formulas and star-this, star-that.
        
           | porridgeraisin wrote:
           | Yeah, the formalisations can be hard to crunch through
           | (especially because of [1]). But this book in particular is
            | quite well laid out. I'd suggest getting a math background on
            | the (very) basics of "contraction mappings", as this is
            | something the book kind of assumes you already know.
           | 
           | [1] There's a lot of confusing naming. For example, due to
           | its historic ties with behavioural psychology, there are a
           | bunch of things called "eligibility traces" and so on. Also,
            | even more than the usual "obscurity through notation" seen in
            | all of math and AI, early RL literature has especially bad
            | notation. You'd see the same letter mean
           | completely different things (sometimes even opposite!) in two
           | different papers.
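            | 
            | FWIW, the contraction-mapping point fits in a few lines of
            | Python (toy two-state MDP, made-up numbers): the Bellman
            | backup shrinks the gap between any two value estimates by
            | at least a factor of gamma, so repeated application
            | converges to one fixed point no matter where you start:
            | 
            |   # deterministic toy MDP: P[s][a] = (next_state, reward)
            |   P = {0: {0: (0, 0.0), 1: (1, 1.0)},
            |        1: {0: (0, 2.0), 1: (1, 0.0)}}
            |   gamma = 0.9
            | 
            |   def bellman(V):
            |       # Bellman optimality backup (a gamma-contraction in sup norm)
            |       return [max(r + gamma * V[s2] for (s2, r) in P[s].values())
            |               for s in P]
            | 
            |   V, U = [0.0, 0.0], [5.0, -3.0]  # two arbitrary starting guesses
            |   for _ in range(10):
            |       V, U = bellman(V), bellman(U)
            |       # gap is multiplied by at most gamma on every sweep
            |       print(max(abs(v - u) for v, u in zip(V, U)))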
        
         | zelphirkalt wrote:
         | You mean "Reinforcement Learning: An Introduction"? Or did they
         | write another one?
        
           | porridgeraisin wrote:
           | Yeah that one. Messed up the name.
        
         | incognito124 wrote:
         | What is your background? Unfortunately I did not find it very
         | accessible.
        
         | jxjnskkzxxhx wrote:
         | That book is a joy. Strong recommend.
        
       | ignoramous wrote:
       | Congratulations to Prof Barto & Prof Sutton. I'm sure the late
       | Harry Klopf is all smiles (:
       | 
       | > _The ACM A.M. Turing Award, often referred to as the "Nobel
       | Prize in Computing," carries a $1 million prize with financial
       | support provided by Google, Inc._
       | 
        | Good on Google, but there will be questions about whether their
        | mere sponsorship in _any_ way influences the awards.
       | 
       | If ACM wanted, could it not raise $1m prize money from non-
       | profits/trusts without much hassle?
        
       | j7ake wrote:
       | Amazing that Sutton (American) chooses to live in Edmonton, AB
       | rather than USA.
       | 
       | Shows he has integrity and is not a careerist focused on prestige
       | and money above all else.
        
         | Philpax wrote:
         | Keen is a fully remote outfit, so he can work wherever. It's
         | pretty likely that his reputation would open that door for him
         | no matter where he goes.
        
           | j7ake wrote:
           | At his level it is much more than just being able to do what
           | he wants, it's about attracting resources and talent to
           | accomplish his goals.
           | 
           | From that perspective location still matters if you want to
           | maximise impact
        
         | tbrockman wrote:
         | As someone who grew up in Edmonton, attended the U of A, and
         | had the good fortune of receiving an incredible CS education at
          | a discount price, I'm incredibly grateful for his (and the
          | other amazing professors') immense sacrifice.
         | 
         | Great people and cheap cost of living, but man do I not miss
         | the city turning into brown sludge every winter.
        
         | jp57 wrote:
         | He's been there since he left Bell Labs, in the mid 2000's, I
         | think. The U of A is, or was, rich with Alberta oil sands money
         | and willing to use it to fund "curiosity-driven research",
         | which is pretty nice if you're willing to live where the
         | temperatures go down to -40 in the winter.
        
         | armSixtyFour wrote:
         | https://nationalpost.com/news/canada/ai-guru-rich-sutton-dee...
         | 
         | He gave up his US citizenship years ago but he explains some of
         | the reasons why he left. I'll also say that the AI research
         | coming out of Canada is pretty great as well so I think it
         | makes sense to do research there.
        
       | pklee wrote:
       | Very well deserved !! Amazing contributions !!
        
       | mark_l_watson wrote:
       | Nice! Well deserved. They make both editions of their RL textbook
        | available as a free-to-read PDF. I have been a paid AI
        | practitioner since 1982, and I must admit that RL is one subject
        | I personally struggle to master; the Sutton/Barto book, the
        | Coursera series on RL taught by Professors White and White, etc.
        | helped me: recommended!
       | 
       | EDIT: the example programs for their book are available in Common
        | Lisp and Python. http://incompleteideas.net/book/the-book-2nd.html
        
       | zackkatz wrote:
       | Very cool to see this! It turns out my wife and I bought Andy
       | Barto's (and his wife's) house.
       | 
       | During the process, there was a bidding war. They said "make your
       | prime offer" so, knowing he was a mathematician, we made an offer
       | that was a prime number :-)
       | 
       | So neat to see him be recognized for his work.
        
         | HPMOR wrote:
         | This is a crazy story!! Hahaha wow. What was the prime number?
        
         | dustfinger wrote:
         | Ha haa, that is fantastic. You should have joked and said -
         | "I'd like to keep things even between us, how about $2?"
        
         | grumpopotamus wrote:
         | > we made an offer that was a prime number
         | 
         | $12345678910987654321?
        
       | optimalsolver wrote:
       | So 2025 really is the year of agents.
        
       | jimbohn wrote:
        | Well deserved. RL will only gain more importance as time goes on
        | thanks to its (and neural nets') flexibility. The bitter lesson
       | won't feel so bitter as we scale.
        
       | byyoung3 wrote:
       | they deserve it. definitely recommend their book
        
       | nextworddev wrote:
       | RL may prove to be the most important tech going fwd due to test
       | time compute
        
       | vicentwu wrote:
       | Great!
        
       | carabiner wrote:
       | Wonder if he's still working in AGI with Carmack.
        
       | cxie wrote:
        | Huge congratulations to Andrew Barto and Richard Sutton on the
        | well-deserved Turing Award! As a student, their textbook
        | Reinforcement Learning: An Introduction was my gateway into the
        | field. I still remember how Chapter 6 on 'Temporal Difference
        | Learning' fundamentally reshaped the way I thought about
        | sequential decision-making.
        | 
        | A timeless classic that I still highly recommend reading today!
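        | 
        | The core of that chapter fits in a few lines. A rough TD(0)
        | sketch in Python (a toy random-walk chain similar to the book's
        | example, with made-up step size and episode count):
        | 
        |   import random
        | 
        |   # states 1..5, start in the middle; step left or right with
        |   # equal probability; reward 1 only when exiting to the right
        |   alpha, episodes = 0.1, 2000
        |   V = [0.0] * 7   # V[0] and V[6] are terminal, value 0
        |   for _ in range(episodes):
        |       s = 3
        |       while 0 < s < 6:
        |           s2 = s + random.choice((-1, 1))
        |           r = 1.0 if s2 == 6 else 0.0
        |           # TD(0): nudge V(s) toward the bootstrapped target r + V(s')
        |           V[s] += alpha * (r + V[s2] - V[s])
        |           s = s2
        | 
        |   print([round(v, 2) for v in V[1:6]])  # approaches 1/6 .. 5/6
        | 
        | Not the book's code, just the shape of the update rule.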
        
       | vonneumannstan wrote:
       | Good time to remind everyone that Sutton is a human successionist
       | and doesn't care if humans all die. He is not to be trusted nor
       | celebrated: https://www.youtube.com/watch?v=NgHFMolXs3U
        
         | nycticorax wrote:
         | This is so silly. Do you imagine temporal difference learning
         | is some kind of human successionist plot?
        
           | vonneumannstan wrote:
           | The video is not about his technical work but rather his view
           | that AI will or should take over the future.
        
             | nycticorax wrote:
             | But the Turing Award _is_ for his technical work.
        
               | kalkin wrote:
               | Sure, and his other views - in the scope of his
               | professional expertise but also quite relevant to, uh,
               | other humans - seem relevant in an HN thread about the
               | Turing award. This place isn't exactly restricted to
               | technical discussion of the details of RL algorithms, and
               | it's pretty fair for humans to have views on whether we
               | ought to be replaced.
               | 
               | It's not just one Youtube video, it's a repeatedly
               | expressed view:
               | 
               | https://x.com/RichardSSutton/status/1575619655778983936
               | 
               | Valuing technological advance for its own sake "beyond
               | good and bad" is an admirably clear statement of how a
               | lot of researchers operate, but that's the best I can say
               | for it.
        
               | nycticorax wrote:
               | The statement I take issue with is that Sutton "is not to
               | be celebrated or trusted". Which I can only interpret to
               | mean that the speaker does not think that Sutton _should_
                | be celebrated or trusted. (And they've chosen to state
               | it in a kind of pompous way.) Which I think is too strong
               | on both counts. I (and apparently the ACM) think that
               | Sutton should be celebrated for his technical
               | accomplishments. Also, I think he probably can be trusted
               | on a lot of technical matters. Should he be trusted on
               | matters of whether there need to be safeguards on AI
               | research imposed by the state? Maybe not, but those are
               | only a subset of all the matters.
        
         | Version467 wrote:
         | Very disappointing. I do not understand how people earnestly
         | defend the successionist view as a good future, but I thought
         | he might at least give some interesting arguments.
         | 
         | This talk isn't that. There are no substantive arguments for
          | why we should embrace this future, and his representation of the
          | opposite side isn't in good faith either; instead he chose to
          | present straw-man versions of them.
          | 
          | He concludes with "A successful succession offers [...] the
          | best hope for a long-term future for humanity." How this can
          | possibly be true when AI succession necessarily includes
          | replacement eludes me. He does mention transhumanism on a
         | slide, but it seems extremely unlikely that he's actually
         | talking about that and the whole succession spiel is just
         | unfortunate wording.
        
           | visarga wrote:
           | > ai succession necessarily includes replacement
           | 
           | How is AI going to make its own chips and energy? The supply
            | chain for AI hardware is long and fragile. AGI will have an
           | interest in maintaining peace for this reason.
           | 
            | And why would it replace us? Our thoughts are like food for
            | AI. Our bodies are very efficient and mobile; biology will
            | certainly be an option for AGI at some point.
        
             | vonneumannstan wrote:
             | Robotics is a software problem now, see the Tesla, Figure
             | or Unitree humanoid bots. An AI can be totally embodied and
             | humans will have little or no value as labor at all.
        
             | hollerith wrote:
             | >How is AI going to make its own chips and energy?
             | 
             | OK, so do you support laws preventing chip manufacturers
             | and energy providers from becoming reliant on AI?
        
             | drcode wrote:
             | > How is AI going to make its own chips and energy?
             | 
              | Pay naive humans to take care of those things while it has to,
             | then disassemble the atoms in their human bodies into raw
             | materials for robots/datacenters once that is no longer
             | necessary
        
         | visarga wrote:
         | I think he is trying to take the positive side of what is
          | probably an inevitability.
        
           | vonneumannstan wrote:
           | Or we could just you know, not build the thing that will
           | probably kill us all and at minimum will obsolete all our
           | labor value.
        
             | jedberg wrote:
             | Given that strategy has never worked in the history of the
             | world, it's probably a good time to figure out how we will
             | put the right guardrails in place and how we will adjust to
             | the new normal.
             | 
             | If "we" don't build it, someone else will.
        
         | textlapse wrote:
          | The ACM award is for their professional academic achievements -
          | this fetish for digging into another person's personal life,
          | finding the weirdest thing they said, and letting it paint
          | all of their life's achievements as evil must stop.
          | 
          | It's silly and dangerous: because you don't like thing A, and
          | they said/did thing A, all of their lofty accomplishments get
          | nullified, by anyone. And worst of all, the internet gives your
          | opinion the same weight as that of someone else (or the rest of
          | us) who knows a lot about thing B that could change the world,
          | in a strictly professional capacity.
          | 
          | This works me up because this is what's dividing people
          | right now at a much larger scale.
         | 
         | I wish you well.
        
           | vonneumannstan wrote:
           | >this fetishism to dig into another person's personal life
           | and find the most weird thing they said as the thing that
           | paints over all of their life's achievements as evil must
           | stop.
           | 
           | This has nothing to do with his professional life. He has
           | made these comments in a professional capacity at an industry
           | AI conference... The rest of your comment is a total non
           | sequitur.
           | 
           | >And worst of all internet gives your opinion the same weight
           | as someone else (or the rest of us) who knows a lot about
           | thing B that could change the world. From a strictly
           | professional capacity.
           | 
           | I've worked professionally in the ML field for 7 years so
           | don't try some appeal to authority bs on me. Geoff Hinton,
           | Yoshua Bengio, Demis Hassabis, Dario Amodei and countless
           | other leaders in the field all recognize and highlight the
           | possible dangers of this technology.
        
             | tsunego wrote:
             | > This has nothing to do with his professional life.
             | 
             | you mean his personal life?
        
               | vonneumannstan wrote:
               | oops, yes.
        
             | textlapse wrote:
             | It just feels like a smear on his character: Imagine
             | working on RL incrementally without any lofty goals or
             | preconceived evil.
             | 
              | I do agree that there are some inherent safety issues with
              | such technologies - but look at the atomic bomb vs fission
              | reactors, etc.: history paves a way through positivity.
              | 
              | Just because someone had an idea that eventually turned out
              | to have some evil branch far from the root doesn't mean they
              | started with the evil idea in the first place or, worse,
              | that someone else wouldn't have.
        
               | hollerith wrote:
               | People left careers in AI in the 1990s because they came
               | to realize that the tech would probably eventually become
               | dangerous. Many more (including the star student in my CS
               | program in the 1980s) never started a career in AI for
               | the same reason.
               | 
               | Sutton and everyone else who has advanced the field
               | deserve condemnation IMO, not awards.
        
           | jffhn wrote:
           | >the most weird thing they said
           | 
           | Reminds me of a quote from Jean Cocteau, of which I could not
           | find the exact words, but which roughly says that if the
           | public knew what thoughts geniuses can have, it would be more
           | terrified than admiring.
        
           | kalkin wrote:
           | > all of their lofty accomplishments get nullified by anyone
           | 
           | I don't think it's a question of whether their achievements
           | are nullified, but as you mention, how to weight the opinions
           | of various people. Personally, I think both a Turing award
           | for technical achievement and a view that humanity ought to
           | be replaced are relevant in evaluating someone's opinions on
           | AI policy, and we shouldn't forget the latter because of the
           | former.
           | 
           | (Also, this isn't about Sutton's personal life - that's a
           | pretty bad strawman.)
        
             | h8hawk wrote:
             | By "view that humanity," do you mean alignment with the
             | effective altruism cult?
             | 
             | Repressive laws on open AI/models--giving elites total
             | control in the name of safety?
             | 
             | And this alternative perspective from the cult should
             | disqualify someone from a Turing Award despite their
             | achievements?
        
               | kalkin wrote:
               | No, a "view that humanity ought to be replaced" is
               | Sutton's, not an EA view. I'm not quite sure how you read
               | that otherwise, except that you seem very angry. I sure
               | hope our alternatives are better than human extinction or
               | total control by elites...
        
         | ks2048 wrote:
         | At least his Twitter profile no longer has the bitcoin-meme-
         | red-eyes thing.
        
         | 317070 wrote:
         | Have you ever met Sutton? He is the most heart-warming, caring
         | and passionate hippy I have ever met. He does not want all
         | humans to die. The talk you link also doesn't support your
         | claim. Perhaps I missed it, in that case, do leave a timestamp.
         | 
         | In the talk, he says it will lead to an era of prosperity for
         | humanity, however without humanity being in sole control of
         | their destiny. His conclusion slide (at 12:33) literally has
         | the bullet point "the best hope for a long-term future for
         | humanity". That is opposite to you saying he "doesn't care if
         | humans all die".
         | 
          | If I plan for my succession, I neither hope nor expect that my
          | daughter will murder me. I'm hoping for a long retirement in
         | good health after which I will quietly pass in my sleep,
         | knowing I left her as well as I could in a symbiotic
         | relationship with the universe.
        
           | vonneumannstan wrote:
           | Here's the difference, you are not personally building the
           | device which will cause your demise and your succession. We
           | as humanity ARE doing that and have agency to choose NOT to
           | do that.
        
         | smokel wrote:
          | It is interesting that you bring this to our attention, but I
         | don't see why we should not trust or celebrate someone if they
         | have views that you don't agree with.
         | 
          | Edit: especially since your implied claim that Sutton would
          | actively want everyone to die seems very much unfounded.
        
         | zoogeny wrote:
         | > doesn't care if humans all die
         | 
         | That seems to be a harsh and misleading framing of his
         | position. My own reading is that he believes it is inevitable
         | that humans will be replaced by transhumans. That seems more
         | like wild sci-fi utopianism than ill-will. It doesn't seem like
         | a reason to avoid celebrating his academic achievements.
        
       | rhema wrote:
       | I used their RL book for a course I taught. It's beautifully
       | written and freely available
       | (http://incompleteideas.net/book/the-book-2nd.html)! I kept
        | getting so distracted by the beautiful writing that I would miss
        | the actual content.
        
       | textlapse wrote:
        | This is a long time coming. They saw an idea through from start
        | to finish and made it span an entire field instead of a
        | subchapter in a dynamic programming book.
        | 
        | I wish a lot more games actually ended up using RL - games being
        | the place where all of this started in the first place. That
        | would be really cool!
        
       | jamesblonde wrote:
       | Built a lot of my PhD on their work 20 years ago. It really stood
       | the test of time.
        
       | wegfawefgawefg wrote:
        | These guys are great, but unfortunately the Sutton and Barto AI
        | book is really bad. You would do better with Grokking Deep
        | Learning by Trask, and then a couple of months of implementing
        | ML papers.
        
         | Buttons840 wrote:
         | I second this suggestion. Read Grokking Deep Reinforcement
         | Learning before reading Sutton. Well, the Sutton book is free,
          | so take a peek, but if the formulas scare you then read
         | Grokking Deep Reinforcement Learning.
        
         | 317070 wrote:
         | These books are about different topics? Sutton and Barto is
         | about Reinforcement learning, and the other book you mention by
         | Trask is on Deep Learning?
        
       ___________________________________________________________________
       (page generated 2025-03-06 23:02 UTC)