[HN Gopher] EURISKO Lives
       ___________________________________________________________________
        
       EURISKO Lives
        
       Author : wodow
       Score  : 112 points
       Date   : 2024-04-23 03:50 UTC (19 hours ago)
        
 (HTM) web link (blog.funcall.org)
 (TXT) w3m dump (blog.funcall.org)
        
       | whartung wrote:
       | The confluence of happenstance that occurs to make this a reality
       | is pretty amazing to witness.
       | 
        | Unfortunately it starts with the passing of Douglas Lenat. But
        | that enabled Stanford to open up their 40-year-old archive of
        | Lenat's work, which they still had.
       | 
        | Somehow, someway, someone not only stumbled upon EURISKO, but
        | also knew what it was: one of the most notorious AI research
        | projects of the age, one that actually broke out of the research
        | labs of Stanford and into the public eye, with impactful results.
       | Granted, for arguably small values of "public" and "impactful",
       | but for the small community it affected, it made a big splash.
       | 
        | Lenat used EURISKO to design a very unconventional fleet
        | configuration and went on to win a national gaming tournament.
        | Twice.
       | 
       | In that community, it was a big deal. The publisher changed the
       | rules because of it, but Lenat returned victorious again the next
       | year. After a discussion with the game and tournament sponsors,
       | he never came back.
       | 
       | Apparently EURISKO has quite a reputation in the symbolic AI
       | world, but even there it was held close.
       | 
       | But now it has been made available. Not only made available, but
       | made operational. EURISKO is written in an obsolete Lisp dialect,
        | Interlisp. But, coincidentally, we today have machine simulators
        | of those long-lost, 40-year-old machines that can run versions of
        | that Lisp.
       | 
       | And someone was able to port it. And it seems to run.
       | 
        | The thought of the tendrils through time that had to twist their
        | way for us to get here leaves me, at least, awestruck. There was
        | so much opportunity for the wrong butterfly to have been stepped
        | on, preventing this from happening.
       | 
       | But it didn't, and here we are. Great job by the spelunkers who
       | dug this up.
        
         | jsnell wrote:
          | Enough of the Traveller tournament story is dodgy and
          | inconsistent that it's very hard to say what actually happened
          | beyond Lenat winning the tournament twice in a row with some
          | kind of computer assistance.
         | 
         | Basically, with the Traveller tournament Lenat appears to have
         | stumbled onto a story that caught the public's imagination, and
          | then milked it for all he could to give his project
         | publicity and to make it appear more successful than it
         | actually was. And if that required embellishing the story or
         | just making shit up, well, no harm no foul.
         | 
         | Even when something is technically true, it often turns out
         | that it's being told in a misleading way. For example, you say
         | that "the publisher changed the ruleset". That was the entire
         | gimmick of the Traveller TCS tournament rules! The printed
         | rulebook had a preset progression of tournament rules for each
         | year.
         | 
         | I wrote a bit more about this a few years ago with some of the
         | other details: https://news.ycombinator.com/item?id=28344379
        
       | peheje wrote:
       | What is it?
        
         | emmelaich wrote:
         | _Eurisko (Gr., I discover) is a discovery system written by
         | Douglas Lenat in RLL-1, a representation language itself
         | written in the Lisp programming language. A sequel to Automated
         | Mathematician, it consists of heuristics, i.e. rules of thumb,
         | including heuristics describing how to use and change its own
         | heuristics_
         | 
         | - https://en.wikipedia.org/wiki/Eurisko
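          | 
          | A minimal sketch of that "heuristics about heuristics" idea -
          | just an illustration in Python with made-up names, not
          | EURISKO's actual RLL-1 representation:
          | 
          |     # A heuristic as plain data: a guard, an action and a worth
          |     # score, so other heuristics can inspect and rewrite it.
          |     class Heuristic:
          |         def __init__(self, name, guard, action, worth=500):
          |             self.name, self.guard = name, guard
          |             self.action, self.worth = action, worth
          | 
          |     def apply_best(heuristics, concept):
          |         """Fire the highest-worth heuristic whose guard accepts
          |         the concept; return whatever new concepts it proposes."""
          |         for h in sorted(heuristics, key=lambda h: -h.worth):
          |             if h.guard(concept):
          |                 return h.action(concept)
          |         return []
          | 
          |     # A meta-heuristic: its input is the pool of heuristics
          |     # itself, so it can change how the system searches.
          |     def demote_unused(heuristics, usage_counts):
          |         for h in heuristics:
          |             if usage_counts.get(h.name, 0) == 0:
          |                 h.worth = max(0, h.worth - 100)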
        
           | KineticLensman wrote:
           | It got a lot of kudos for winning a multi player naval
           | wargame by building a bizarre but successful fleet that
           | exploited all the loopholes and quirks in the rules.
        
             | actionfromafar wrote:
             | Didn't it build a swarm of tiny boats? That loophole seems
             | to be currently exploited in the real world, too.
        
               | KineticLensman wrote:
               | IIRC it had at least one small (?) purely defensive boat
               | that couldn't be destroyed by typical weapons so its
               | parent fleet couldn't be defeated. It wasn't like a
                | modern drone swarm.
        
               | PaulHoule wrote:
               | It makes me think of the battles in Doc Smith's _Lensman_
               | series where the Galactic Patrol would develop a game-
               | breaking fleet formation to use against Boskone in every
               | major naval battle.
        
               | KineticLensman wrote:
               | Ah yes, probably need to re-read them. I remember how
               | virtually every book introduces a new super weapon that
               | becomes part of the standard arsenal in the next, all the
               | way up to entire 'negative matter' planets that are fired
               | out of subspace (can't recall the in-universe name) at
               | the planets hosting your opponent's base.
        
               | PaulHoule wrote:
               | These are all on the Canadian Gutenberg.
        
             | pfdietz wrote:
             | Traveller Trillion Credit Squadron
             | 
             | Traveller was (and is) a space-based RPG, although the
             | original publisher is long out of business.
             | 
             | https://en.wikipedia.org/wiki/Traveller_(role-playing_game)
        
               | pfdietz wrote:
               | TCS itself: https://www.mongoosepublishing.com/products/a
               | dventure-5-tril...
        
           | boffinAudio wrote:
           | Can anyone give a clear example of how this can be used
           | productively? Its description doesn't help much.
           | 
           | What can one _do_ with EURISKO? The fact of its recovery
            | after its author's passing is interesting, in and of itself -
           | but why is EURISKO, specifically, worth the effort of
           | understanding?
        
             | pamoroso wrote:
             | This post also explains what EURISKO does by putting it
             | into the context of the later system Doug Lenat worked on,
             | Cyc https://outsiderart.substack.com/p/cyc-historys-
             | forgotten-ai...
        
             | isaacfrond wrote:
             | Excellent question. I have no clue either.
        
       | downvotetruth wrote:
       | dupe https://news.ycombinator.com/item?id=40095121
       | 
       | 4 months ago: https://news.ycombinator.com/item?id=38413615
        
       | varjag wrote:
        | I shall belatedly correct myself: the heuristic I point at after
        | IsA is not in fact not-IsA. Also, the system runs out of stack
        | space, not heap space.
        
       | thom wrote:
       | Up until about GPT 2, EURISKO was arguably the most interesting
       | achievement in AI. Back in the day on the SL4 and singularitarian
       | mailing lists, it was spoken of in reverent tones, and I'm sure I
       | remember a much younger Eliezer Yudkowsky cautioning that Doug
       | Lenat should have perceived a non-zero chance of hard takeoff at
       | the moment of its birth. I suspect its achievements were slightly
       | overblown and heavily guided by a human hand, but it's still
       | fascinating and definitely worthy of study. Genetic programming
       | hasn't yielded many interesting results since, and the
        | unreasonable effectiveness of differentiable programming and
       | backpropagation has sucked up much of the oxygen in the room. But
       | not everything is differentiable, the combination of the two
       | still seems worth investigating, and EURISKO goes to show the
       | power of heuristic approaches to some problems.
        
         | cabalamat wrote:
         | > Up until about GPT 2, EURISKO was arguably the most
         | interesting achievement in AI.
         | 
         | I agree.
         | 
         | > I suspect its achievements were slightly overblown and
         | heavily guided by a human hand
         | 
         | So do I. We'll find out how much of its performance was real,
         | and how much bullshit.
         | 
         | > the unreasonably effectiveness of differentiable programming
         | and backpropagation has sucked up much of the oxygen in the
         | room
         | 
         | The Bitter Lesson --
         | http://www.incompleteideas.net/IncIdeas/BitterLesson.html
        
         | Animats wrote:
         | Not really. Read [1], which references "Why AM and Eurisko
         | appear to work". There's a reason that line of development did
         | not continue.
         | 
         | [1] https://news.ycombinator.com/item?id=28343118
        
         | radomir_cernoch wrote:
         | > Up until about GPT 2, EURISKO was arguably the most
         | interesting achievement in AI.
         | 
            | I'm really baffled by such a statement and genuinely curious.
         | 
         | How come that studying GOFAI as undergraduate and graduate at
         | many European universities, doing a PhD. and working in the
         | field for several years _never_ exposed me to EURISKO up until
         | last week (thanks to HN)?
         | 
            | I heard about Cyc and many formalisms and algorithms related
            | to EURISKO, but never heard its name.
         | 
         | Is EURISKO famous in US only?
        
           | radomir_cernoch wrote:
           | For that reason, a comparison between GPT 2 and EURISKO seems
           | funny to me.
           | 
           | I discussed ChatGPT with my yoga teacher recently, but I bet
           | not even my IT colleagues would have a clue about EURISKO.
           | :-)
        
             | Phiwise_ wrote:
             | So? There's a real possibility DART has still saved its
             | customers more money over its lifetime than GPT has, and
             | odds are basically 100% that your yoga teacher and IT
              | colleagues haven't heard a thing about it either. The
              | general public has all sorts of wrong impressions and
              | unknown unknowns, so I don't see why they should ever be
              | used as a technology-industry benchmark by anyone not
              | working in the UI department of a smartphone vendor.
        
           | rjsw wrote:
           | > Is EURISKO famous in US only?
           | 
            | It was featured in a BBC radio series on AI made by Colin
            | Blakemore [1] around 1980, and the papers on AM and EURISKO
            | were in the library of the UK university that I attended.
           | 
           | [1] https://en.wikipedia.org/wiki/Colin_Blakemore#Public_enga
           | gem...
        
         | craigus wrote:
         | "... I'm sure I remember a much younger Eliezer Yudkowsky
         | cautioning that Doug Lenat should have perceived a non-zero
         | chance of hard takeoff at the moment of its birth."
         | 
         | https://www.lesswrong.com/posts/rJLviHqJMTy8WQkow/recursion-...
        
           | cabalamat wrote:
           | Also, in 2009 someone suggested re-implementing Eurisko[1],
           | and Yudkowsky cautioned against it:
           | 
           | > This is a road that does not lead to Friendly AI, only to
           | AGI. I doubt this has anything to do with Lenat's motives -
           | but I'm glad the source code isn't published and I don't
           | think you'd be doing a service to the human species by trying
           | to reimplement it.
           | 
           | To my mind -- and maybe this is just the benefit of hindsight
           | -- this seems way too overcautious on Yudkowsky's part.
           | 
           | [1]: https://www.lesswrong.com/posts/t47TeAbBYxYgqDGQT/let-s-
           | reim...
        
             | randallsquared wrote:
             | Machinery can be a lot simpler than biology. Birds are
             | incredibly complex systems: wing structure, musculature,
             | feathers, etc. An airplane can be a vaguely wing-shaped
             | piece of metal and a pulse jet. It doesn't seem super
             | implausible that there is some algorithm that is to human
             | consciousness what a pulse jet with wings is to a bird.
             | Maybe LLMs are that, but maybe they're far more than is
             | really needed because we don't yet know what we are doing.
             | 
             | I would bet against it being possible to implement
             | consciousness on a PDP, but I wouldn't be very confident
             | about it.
        
         | api wrote:
         | > a much younger Eliezer Yudkowsky cautioning that Doug Lenat
         | should have perceived a non-zero chance of hard takeoff at the
         | moment of its birth
         | 
         | Why is Yudkowsky taken seriously? This stuff is comparable to
         | the "LHC micro black holes will destroy Earth" hysteria.
         | 
         | There are actual concerns around AI like deep fakes, a deluge
         | of un-filterable spam, mass manipulation via industrial scale
         | propaganda, mass unemployment created by widespread automation
         | leading to civil unrest, opaque AIs making judgements that
         | can't be evaluated properly, AI as a means of mass
         | appropriation of work and copyright violation, concentration of
         | power in large AI companies, etc. The crackpot "hard takeoff"
         | hysteria only distracts from reasonable discourse about these
         | risks and how to mitigate them.
        
           | TeMPOraL wrote:
            | > _Why is Yudkowsky taken seriously?_
            | 
            |     Trivialities   Annoyances   Immediate harm     X-Risk
            |     |---------------------------------------------------|
            |     \------stuff you mention--------/
            |                            \--stuff Eliezer wrote about--/
            | 
           | > _The crackpot "hard takeoff" hysteria only distracts from
           | reasonable discourse about these risks and how to mitigate
           | them._
           | 
           | IDK, I feel endless hand-wringing about copyright and
            | deepfakes distracts from risks of actual, significant harm at
           | scale, some of which you also mentioned.
        
           | thom wrote:
           | Perhaps we can disagree on the shape of the curve, but it
           | seems likely that ever more capable AI will enable ever more
           | serious harms. Absolutely true that we should counter those
           | harms in the present and not fixate on a theoretical future,
           | but the medicine is much the same either way.
        
           | adw wrote:
           | > Why is Yudkowsky taken seriously?
           | 
           | People like religion, particularly if it doesn't affect how
           | they live their life _today_ too much. You get all of the
           | emotional benefits of feeling like you're doing something
           | virtuous without the effort of actually performing good
           | works.
        
           | rvba wrote:
           | > "LHC micro black holes will destroy Earth" hysteria.
           | 
           | I will be heavily downvoted for this, but here is how I
           | remember it:
           | 
            | 1) The LHC was used to study black holes and prove things
            | like Hawking radiation
            | 
            | 2) The LHC was supposed to be safe due to Hawking radiation
            | (which was only an unproven theory at the time)
            | 
            | So the unpopular question: what if Hawking radiation didn't
            | actually exist? Wouldn't there be a risk of us dying? A small
            | risk, but still some risk? (especially as the potential micro
            | black hole would have the same velocity as Earth, so it
            | wouldn't fly away somewhere into space)
            | 
            | On a side note: how would EURISKO evaluate this topic?
            | 
            | Since I read about this secretive Cyc (why can you email
            | asking for it, but the source isn't hosted anywhere?):
            | couldn't any current statistics-based AI be used to feed this
            | Cyc program / database with information? Take a dictionary
            | and ask ChatGPT to fill it with information for each word.
        
             | api wrote:
             | The fundamental reason that hysteria was silly is that
             | Earth is bombarded by cosmic rays that are far stronger
             | than anything done in the LHC. The reason we built the LHC
             | is so we can do observable repeatable experiments at high
             | energies, not to reach energies never reached on Earth
             | before.
             | 
             | The AI hysteria I'm talking about here is the "foom"
             | hysteria, the idea that a sufficiently powerful model will
             | start self-improving without bound and become some kind of
              | AI super-god. That's about as wild as the idea that the LHC
              | will make a black hole that implodes the Earth. There are
             | fundamental reasons to believe it's impossible, such as the
             | question of "where would the information come from to drive
             | that runaway intelligence explosion?"
             | 
             | There are legitimate risks with AI, but not because AI is
             | somehow special and magical. All technologies have risks.
             | If you make a sharper stick, someone will stab someone with
             | it. Someday we may make a stick so sharp it stabs the
             | entire world (cue 50s sci-fi theremin music).
             | 
             | Edit: for example... I would argue that the Internet itself
             | has X-risks. The Internet creates an environment that
             | incentivizes an arms race for attention grabbing, and the
             | most effective strategies usually rely on triggering
             | negative emotions and increasing division. This could run
             | away to the point that it drives, say, civilizational
             | collapse or a global thermonuclear war. Does this mean it
             | would have been right to ban the Internet or require strict
             | licensing to place any new system online?
        
         | lisper wrote:
         | > the combination of the two still seems worth investigating
         | 
         | This.
         | 
         | Back in the late 1980's and early 90's the debate-du-jour was
         | between deliberative and reactive control systems for robots. I
         | got my Ph.D. for simply saying that the entire debate was based
         | on the false premise that it had to be one or the other, that
         | each approach had its strengths and weaknesses, and that if you
         | just put the two together the whole would be greater than the
         | sum of its parts. (Well, it was a little more than that. I had
          | to actually show that it worked, which was more work than
         | simply advancing the hypothesis, but in retrospect it seems
         | kinda obvious, doesn't it?)
         | 
         | If I were still in the game today, combining generative-AI and
         | old-school symbolic reasoning (which has also advanced a lot in
         | 30 years) would be the first thing I would focus my attention
         | (!) on.
        
           | adw wrote:
           | People have advanced that argument a lot, and it's often
           | worked for a short while; then the statistical models get
           | better.
           | 
           | Chess was a game for humans.
           | 
           | It was very briefly a game for humans and machines (Kasparov
           | had a go at getting "Advanced Chess" off the ground as a
           | competitive sport), but soon enough having a human in the
           | team made the program worse.
           | 
           | But at least the evaluation functions were designed by
           | humans, right? That lasted a remarkably long time; first
           | Stockfish became the strongest engine in the world by using
           | distributed hyperparameter search to tweak its piece-square
           | tables, then AlphaZero came along and used a policy network +
           | MCTS instead of alpha-beta search, then (with an assist from
           | the Shogi community) Stockfish struck back with a completely
           | learned evaluation function via NNUE.
           | 
           | So the last frontier of human expertise in chess is search
           | heuristics, and that's going to fall too:
           | https://arxiv.org/abs/2402.04494.
           | 
            | The common theme in all of this is that the things we used
            | before are, fundamentally, hacks to get around _not having
            | enough compute_, but they make the system worse once you
            | don't have to make those tradeoffs around inductive biases.
            | Empirical evidence suggests that raw scaling has a
           | long way to run yet.
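            | 
            | (For concreteness, "piece-square tables" are just per-square
            | bonuses added to material - the kind of hand-written knowledge
            | NNUE later displaced. A toy sketch with illustrative numbers,
            | not Stockfish's actual tables:)
            | 
            |     MATERIAL = {"P": 100, "N": 320, "B": 330,
            |                 "R": 500, "Q": 900, "K": 0}
            | 
            |     # Example piece-square table: knights prefer the centre.
            |     KNIGHT_PST = [
            |         [-50, -40, -30, -30, -30, -30, -40, -50],
            |         [-40, -20,   0,   5,   5,   0, -20, -40],
            |         [-30,   5,  10,  15,  15,  10,   5, -30],
            |         [-30,   0,  15,  20,  20,  15,   0, -30],
            |         [-30,   5,  15,  20,  20,  15,   5, -30],
            |         [-30,   0,  10,  15,  15,  10,   0, -30],
            |         [-40, -20,   0,   0,   0,   0, -20, -40],
            |         [-50, -40, -30, -30, -30, -30, -40, -50],
            |     ]
            | 
            |     def evaluate(pieces):
            |         """pieces: list of (symbol, rank, file, is_white).
            |         Returns centipawns; positive means better for White."""
            |         score = 0
            |         for sym, rank, file, is_white in pieces:
            |             v = MATERIAL[sym]
            |             if sym == "N":
            |                 v += KNIGHT_PST[rank][file]
            |             score += v if is_white else -v
            |         return score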
        
             | YeGoblynQueenne wrote:
             | That's the "bitter lesson", right? Which is really a sour
             | lesson- as in sour grapes. See, Rich Sutton's point with
             | his Bitter Lesson is that encoding expert knowledge only
             | improves performance temporarily, which is eventually
             | surpassed by more data and compute.
             | 
             | There are only two problems with this: One, statistical
             | machine learning systems have an extremely limited ability
             | to encode expert knowledge. The language of continuous
             | functions is alien to most humans and it's very difficult
             | to encode one's intuitive, common sense knowledge into a
             | system using that language [1]. That's what I mean when I
             | say "sour grapes". Statistical machine learning folks can't
             | use expert knowledge very well, so they pretend it's not
             | needed.
             | 
             | Two, all the loud successes of statistical machine learning
             | in the last couple of decades are closely tied to minutely
             | specialised neural net architectures: CNNs for image
             | classification, LSTMs for translation, Transformers for
              | language, Diffusion models and GANs for image generation.
             | If that's not encoding knowledge of a domain, what is?
             | 
             | Three, because of course three, despite point number two,
             | performance keeps increasing only as data and compute
             | increases. That's because the minutely specialised
             | architectures in point number two are inefficient as all
             | hell; the result of not having a good way to encode expert
             | knowledge. Statistical machine learning folk make a virtue
             | out of necessity and pretend that only being able to
             | increase performance by increasing resources is some kind
             | of achievement, whereas it's exactly the opposite: it is a
             | clear demonstration that the capabilities of systems are
             | not improving [2]. If capabilities were improving, we
             | should see the number of examples required to train a
             | state-of-the-art system either staying the same, or going
             | down. Well, it ain't.
             | 
             | Of course the neural net [community] will complain that
             | their systems have reached heights never before seen in
             | classical AI, but that's an argument that can only be
             | sustained by the ignorance of the continued progress in all
             | the classical AI subjects such as planning and scheduling,
             | SAT solving, verification, automated theorem proving and so
             | on.
             | 
             | For example, and since planning is high on my priorities
             | these days, see this video where the latest achievements in
             | planning are discussed (from 2017).
             | 
             | https://youtu.be/g3lc8BxTPiU?si=LjoFITSI5sfRFjZI
             | 
             | See particularly around this point where he starts talking
             | about the Rollout IW(1) symbolic planning algorithm that
             | plays Atari from screen pixels with performance comparable
             | to Deep-RL; except it does so _online_ (i.e. no training,
             | just reasoning on the fly):
             | 
             | https://youtu.be/g3lc8BxTPiU?si=33XSM6yK9hOlZJnf&t=1387
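              | 
              | (For a flavour of what "width 1" means there - a toy sketch
              | of plain IW(1), not the Rollout variant from the talk, with
              | hypothetical state/atom helpers:)
              | 
              |     from collections import deque
              | 
              |     def iw1(start, successors, atoms, is_goal):
              |         """Breadth-first search with width-1 pruning: a state
              |         is kept only if it makes some atom true for the first
              |         time in the search. No training, just search."""
              |         seen_atoms = set(atoms(start))
              |         frontier = deque([start])
              |         while frontier:
              |             s = frontier.popleft()
              |             if is_goal(s):
              |                 return s
              |             for nxt in successors(s):
              |                 novel = set(atoms(nxt)) - seen_atoms
              |                 if novel:            # novelty 1: keep it
              |                     seen_atoms |= novel
              |                     frontier.append(nxt)
              |                 # otherwise prune: nothing new about nxt
              |         return None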
             | 
             | Bitter lesson my sweet little ass.
             | 
             | ____________
             | 
             | [1] Gotta find where this paper was but none other than
             | Vladimir Vapnik basically demonstrated this by trying the
             | maddest experiment I've ever seen in machine learning:
             | using poetry to improve a vision classifier. It didn't
             | work. He's spent the last 20 years trying to find a good
             | way to encode human knowledge into continuous functions. It
             | doesn't work.
             | 
             | [2] In particular their capability for inductive
             | generalisation which remains absolutely crap.
        
               | og_kalu wrote:
               | >Two, all the loud successes of statistical machine
               | learning in the last couple of decades are closely tied
               | to minutely specialised neural net architectures: CNNs
               | for image classification, LSTMs for translation,
                | Transformers for vision, Diffusion models and GANs for
               | image generation. If that's not encoding knowledge of a
               | domain, what is?
               | 
                | Transformers, Diffusion for Vision, Image generation are
                | really odd examples here. None of those architectures or
                | training processes were designed with Vision in mind lol.
                | It was what? 3 years after "Attention" in 2017 before the
                | famous ViT paper. CNNs have lost a lot of favor to ViTs,
                | and LSTMs are not the best-performing translators today.
               | 
               | The bitter lesson is that less encoding of "expert"
               | knowledge results in better performance and this has
                | absolutely held up. The "encoding of knowledge" you say
                | these architectures represent is nowhere near that of the
                | GOFAI kind, and more than that, less-biased NN
                | architectures seem to be winning out.
               | 
               | >That's because the minutely specialised architectures in
               | point number two are inefficient as all hell; the result
               | of not having a good way to encode expert knowledge.
               | 
               | Inefficient is a whole lot better than can't even play
               | the game, the story of GOFAI for the last few decades.
               | 
               | >If capabilities were improving, we should see the number
               | of examples required to train a state-of-the-art system
               | either staying the same, or going down. Well, they ain't.
               | 
               | The capabilities of models are certainly increasing. Even
               | your example is blatantly wrong. Do you realize how much
                | more data and compute it would take to train a vanilla
                | RNN to, say, GPT-3-level performance?
        
               | YeGoblynQueenne wrote:
               | >> Inefficient is a whole lot better than can't even play
               | the game, the story of GOFAI for the last few decades.
               | 
               | See e.g. my link above where GOFAI plays the game (Atari)
               | very well indeed.
               | 
               | Also see Watson winning Jeopardy (a hybrid system, but
               | mainly GOFAI - using frames and Prolog for knowledge
               | extraction, encoding and retrieval).
               | 
               | And Deep Blue beating Kasparov. And MCTS still the SOTA
               | search algo in Go etc.
               | 
                | And EURISKO playing Traveller, as above.
               | 
               | And Pluribus playing Poker with expert game-playing
               | knowledge.
               | 
               | And the recent neuro-symbolic DeepMind thingy that solves
               | geometry problems from the maths olympiad.
               | 
               | etc. etc. [Gonna stop editing and adding more as they
               | come to my mind here.]
               | 
               | And that's just playing games. As I say in my comment
               | above planning and scheduling, SAT, constraints,
               | verification, theorem proving- those are still dominated
               | by classical systems and neural nets suck at them. Ask
               | Yan LeCun: "Machine learning sucks". He means it sucks in
               | all the things that classical AI does best and he means
               | he wants to do them with neural nets, and of course he'll
               | fail.
        
               | og_kalu wrote:
               | That was a figure of speech. I didn't literally mean
               | games (not that GOFAI performs better than NNs in those
               | games anyway). I simply went off your own examples -
               | Vision, Image generation, Translation etc.
               | 
               | >As I say in my comment above planning and scheduling,
               | SAT, constraints, verification, theorem proving- those
               | are still dominated by classical systems
               | 
                | You can use NNs for all these things. It wouldn't make a
                | lot of sense, because GOFAI would be perfect and the NNs
                | would be inefficient, but you certainly could - which is,
                | again, more than I can say for GOFAI in the domains you
                | listed.
        
               | YeGoblynQueenne wrote:
               | I don't understand your comment. Clarify.
               | 
               | As it is, your comment seems to tell me that neural nets
               | are good at neural net things and GOFAI is good at GOFAI
               | things, which is obvious, and is what I'm saying: neural
               | nets can make only very limited use of expert knowledge
               | and so suck in all domains where domain knowledge is
               | abundant and abundantly useful, which are the same
               | domains where GOFAI dominates. GOFAI can make very good
               | use of expert knowledge but is traditionally not as good
               | in domains where only tacit knowledge is available,
               | because we don't understand the domain well enough yet,
               | like in anything to do with pattern recognition, which is
               | the same domains where neural nets dominate. If explicit,
               | expert knowledge was available for those domains, then
               | GOFAI would dominate, and neural nets would fall behind,
               | completely contrary to what Sutton thinks.
               | 
               | So, the bitter lesson is only bitter for those who are
               | not interested in what classical AI systems can do best.
               | For those of us who are, the lesson is sweet indeed:
               | we're making progress, algorithmic progress, progress in
               | understanding, scientific progress, and don't need to
                | burn through thousands of credits to train on server farms
               | to do anything of note. That's even a running joke in my
               | team: hey, do you need any server time? Nah, I'll run the
               | experiment on my laptop over lunch. And then beat the RL
               | algo (PPO) that needs three days training on GPUs. To
               | solve mazes badly.
        
               | og_kalu wrote:
               | NNs can do the things GOFAI is good at a whole lot better
               | than GOFAI can do the things NNs are good at.
        
               | YeGoblynQueenne wrote:
               | That's wishful thinking not supported by empirical
               | results.
        
               | YeGoblynQueenne wrote:
               | Addendum:
               | 
               | >> Do you realize how much more data and compute it would
               | take to train a Vanilla RNN to say GPT-3 level
               | performance?
               | 
               | Oh, good point. And what would GPT-3 do with the typical
               | amount of data used to train an LSTM? Rhetorical.
        
               | adw wrote:
               | > And MCTS still the SOTA search algo in Go etc
               | 
               | It's often forgotten that Rich Sutton said the two things
               | which work are learning (the AlphaGo/Leela Zero policy
               | network) and search (MCTS). (I think the most interesting
               | research in ML is around the circumstances in which large
               | models wind up performing implicit search.)
        
               | adw wrote:
               | Yeah, all of those architectures are _themselves_ hacks
               | to get around having insufficient compute! They
               | absolutely were encoding inductive biases into the
               | network to get around not being able to train enough, and
               | transformers (handwaving hard enough to levitate, the
               | currently-trainable model family with the least inductive
               | bias) have eaten the world in all domains.
               | 
               | This is evidence _for_ the Bitter Lesson, not against it.
        
               | YeGoblynQueenne wrote:
               | They haven't (eaten the world etc). They just happen to
               | be the models that trend hard right now. I bet if you
               | could compare like for like you'd be able to see _some_
                | improvement in performance from Transformers, but that'd
                | be extremely hard to separate from the expected
               | improvement from the constantly increasing amounts of
               | data and compute. For example, you could, today, train a
               | much bigger and deeper Multi-Layered Perceptron than you
                | could thirty years ago, but nobody is trying because
               | that's so 1990's, and in any case they have the data and
               | compute to train much bigger, much more inefficient
               | (contrary to what you say if I got that right)
               | architectures.
               | 
               | Wait a few years and the Next Big Thing in AI will come
               | along, hot on the heels of the next generation of GPUs,
               | or tensor units or whatever the hardware industry can
               | cook up to sell shovels for the gold rush. By then,
                | Transformers will have hit the plateau of diminishing
               | returns, there'll be gold in them there other hills and
               | nobody would talk of LLMs anymore because that's so
               | 2020s. We've been there so many times before.
        
               | adw wrote:
               | > much more inefficient
               | 
               | The tricky part here is that "efficiency" is not a single
               | dimension! Transformers are much more "efficient" in one
               | sense, in that they appear to be able to absorb much more
               | data before they saturate; they're in general less
               | computationally efficient in that you can't exploit
               | symmetries as hard, for example, at implementation time.
               | 
               | Let's talk about that in terms of a concrete example: the
               | big inductive bias of CNNs for vision problems is that
               | CNNs essentially presuppose that the model should be
               | translation-invariant. This works great -- speeds up
               | training and makes it more stable - until it doesn't and
               | that inductive bias starts limiting your performance,
               | which is in the large-data limit.
               | 
               | Fully-connected NNs are more general than transformers,
               | but they have _so many_ degrees of freedom that the
               | numerical optimization problem is impractical. If someone
               | figures out how to stabilize that training and make these
               | implementable on current or future hardware, you're
               | absolutely right that you'll see people use them. I don't
               | think transformers are magic; you're entirely correct in
               | saying that they're the current knee on the
               | implementability/trainability curve, and that can easily
               | shift given different unit economics.
               | 
               | I think one of the fundamental disconnects here is that
               | people who come at AI from the perspective of logic down
               | think of things very differently to people like me who
               | come at it from thermodynamics _up_.
               | 
               | Modern machine learning is just "applications of maximum
               | entropy", and to someone with a thermodynamics
               | background, that's intuitively obvious (not necessarily
               | correct! just obvious) -in a meaningful sense the
               | _universe_ is a process of gradient descent, so "of
               | course" the answer for some local domain models is
               | maximum-entropy too. In that world view, the higher-order
               | structure is _entirely emergent_. I'm, by training, a
               | crystallographer, so the idea that you can get highly
               | regular structure emerging from merciless application of
               | a single principle is just baked into my worldview very
               | deeply.
               | 
               | Someone who comes at things from the perspective of
               | mathematical logic is going to find that worldview very
               | weird, I suspect.
        
               | adw wrote:
               | > They haven't (eaten the world etc).
               | 
               | To clarify what I mean on this specific bit: the SOTA
               | results in 2D and 3D vision, audio, translation, NLP, etc
               | are all transformers. Past results do not necessarily
               | predict future performance, and it would be absurd to
                | claim that's an immutable state of affairs, but it's
               | certainly interesting that all of the domain-specific
               | architectures have been flattened in a very short period
               | of time.
        
               | adw wrote:
               | > The language of continuous functions is alien to most
               | humans and it's very difficult to encode one's intuitive,
               | common sense knowledge into a system using that language
               | 
                | In other words: machine-learned models are octopus brains
                | (https://www.scientificamerican.com/article/the-mind-of-
                | an-oc...) and that creeps you out. Fair enough, it creeps
                | me out too, and we should honour our emotions -- I'm no
                | rationalist -- but we should also be aware of the risks of
                | confusing our emotional responses with reality.
        
               | gwern wrote:
               | Vapnik: https://www.cs.princeton.edu/courses/archive/spri
               | ng13/cos511... https://engineering.columbia.edu/files/eng
               | ineering/vapnik.pd...
               | https://www.learningtheory.org/learning-has-just-started-
               | an-... https://nautil.us/teaching-me-softly-234576/
               | 
               | The main paper: https://gwern.net/doc/reinforcement-
               | learning/exploration/act...
               | 
               | It sounds kinda crazy (is there really that much far
               | transfer?), but you know, I think it would work... He
               | just needed to use LLMs instead:
               | https://arxiv.org/abs/2309.10668#deepmind
        
             | talldayo wrote:
             | I find myself not wanting to agree with you, but deep down
             | I think you're right.
             | 
             | AI greatly reminds me of the Library of Babel thought
             | experiment. If we can imagine a library with every book
             | that can possibly be written in any language, would it
             | contain all human knowledge lost in a sea of noise? Is
             | there merit or value in creating a system that sifts
             | through such a library to attune hidden truths, or are we
             | dooming ourselves to finding meaning in nothingness?
             | 
             | In a certain sense, there's immense value to developing
             | concepts and ideas through intuition and thought. In
             | another sense, a rose by any other name smells just as
             | sweet; if an AI creates a perpetual motion device before a
             | human does, that's not nothing. I don't _expect_ AI to
             | speed past human capability like some people do, but it 's
             | certainly displaced a lot of traditional computer-vision
             | and text generation applications.
        
             | lisper wrote:
             | > then the statistical models get better
             | 
             | Maybe. The statistical models are definitely better at
             | natural language processing now, but they still fail on
             | analytical tasks.
             | 
             | Of course, human brains are statistical models, so there's
             | an existence proof that a sufficiently large statistical
             | model is, well, sufficient. But that doesn't mean that you
             | couldn't do better with an intelligently designed co-
             | processor. Even humans do better with a pocket calculator,
             | or even a sheet of paper, than they do with their unaided
             | brains.
        
               | YeGoblynQueenne wrote:
               | If human brains are statistical models, why are human
               | brains so bad at statistics?
               | 
                | Edit: btw, same for probabilistic inference, same for
               | logical inference, and same for any other thing anyone's
               | tried as the one true path to AI since the 1950's. Humans
               | have consistently proven bad at everything computers are
               | good at, and that tells us nothing about why humans are
               | good at anything (if, indeed, we are). Let's not assume
               | too much about brains until we find the blueprint, eh?
        
               | lisper wrote:
               | > why are human brains so bad at statistics?
               | 
               | That depends on what you mean by being "bad at
               | statistics." What brains do on a conscious level is very
               | different than what they do at a neurobiological level.
               | Brains are "bad at statistics" on the conscious level,
               | but at the level of neurobiology that's all they do.
               | 
               | As an analogy, consider a professional tennis or baseball
               | player. At the neurobiological level those people are
               | extremely good at finding solutions to kinematic
               | equations, but that doesn't mean that they would ace a
               | physics test.
        
               | YeGoblynQueenne wrote:
               | That is a very big assumption -that brains have conscious
               | and subconscious levels that are good and bad at
               | different things- that needs to be itself proved, before
               | it can be used to support any other line of inquiry.
               | 
               | I'm not well versed in the relevant literature at all but
               | my understanding is that research in the area points to
               | the completely opposite direction: that humans e.g.
               | playing baseball do not find solutions to kinematic
               | equations, but instead use simple heuristics that exploit
               | our senses and body configuration, like placing their
               | hands in front of their eyes so that they line up with
               | the ball etc.
               | 
               | This makes a lot more sense, not only for humans playing
               | tennis, but for animals surviving in the wild, finding
               | sustenance and shelter, and mates, while avoiding
               | becoming a meal. Consider the Portia spider [1], a
               | spider-hunting spider, itself prey to other hunting
               | spiders, with a brain consisting of a few tens of
               | thousands of neurons and still perfectly capable not only
               | of navigating complex environments in all three space
               | dimensions but also making complex plans involving
               | detours.
               | 
               | Just think of how quickly a spider must be able to think
               | that hunts, and is hunted by other spiders -some of the
                | most deadly predators in the animal kingdom. There is not
                | a snowball's chance in hell that such an animal has the
                | time to solve kinematic equations with a few KBs of
                | neurons. Absolutely no chance at all.
               | 
                | For that reason, and many other things like it, it looks
                | very unlikely to me that human brains, or any brains, work
                | the way you say. In any case, that sounds positively
                | Freudian and I don't mean that as an insult, but I so
                | could.
               | 
               | ______________
               | 
               | [1] My favourite. No, I don't mean meal. I just love this
               | paper; it's almost the best paper in autonomous robotics
               | and planning that I've ever read:
               | 
               | https://www.frontiersin.org/journals/psychology/articles/
               | 10....
        
               | lisper wrote:
               | > That is a very big assumption -that brains have
               | conscious and subconscious levels that are good and bad
               | at different things- that needs to be itself proved,
               | before it can be used to support any other line of
               | inquiry.
               | 
               | You can't be serious. Do you really doubt that hand-eye
               | coordination and solving systems of kinematic equations
               | on paper using math are disjoint skills? That one can be
               | good at one without being good at the other? That there
               | is in actual fact an inverse correlation between these
               | skills? How do you account for the fact that even people
               | who have never studied math or physics can learn to throw
               | and catch a ball?
        
               | YeGoblynQueenne wrote:
               | ... because they don't need to use maths or physics?
               | 
               | And yes, I'm serious. Can you please be less
               | confrontational?
        
               | lisper wrote:
               | Sorry about that, I'm dealing with a troll on another
               | thread so I'm on a bit of a hair trigger.
               | 
               | I think we have a fundamental disconnect somewhere, so
               | let's try to diagnose it. Where do you start to disagree
               | in the following series of claims:
               | 
               | 1. People can have kinematic skills, like throwing and
               | catching balls, without having math or physics skills,
               | like solving kinematic equations.
               | 
               | 2. In order to have kinematic skills, _something_ in your
               | brain must be doing something that can be equated by some
               | mapping to solving kinematic equations, because the
               | actions that your muscles perform when performing
               | kinematic skills _are_ the solutions to kinematic
                | equations, so your brain must be producing those (things
                | that map to) solutions somehow. (A toy contrast is
                | sketched at the end of this comment.)
               | 
               | 3. As far as we can tell, brains don't operate
               | symbolically at the neurobiological level. Individual
               | neurons operate according to laws having to do with
               | electrical impulses, synapse firings, neurotransmitters,
                | etc., none of which have anything to do with kinematics.
               | 
               | 4. People with kinematic skills generally have only
               | limited insight into how they do what they do when they
               | apply those skills. Being able to catch a ball doesn't by
               | itself give you enough insight to be able to describe to
               | someone how to build a machine that would catch a ball.
               | But someone with math and physics and engineering skills
                | but no kinematic skills (your stereotypical geek) could
               | plausibly build a machine that could catch a ball much
               | better than they themselves could. But the workings of a
               | machine built using knowledge of math would almost
               | certainly operate in a very different manner than the
               | brain of a human with kinematic skills.
               | 
               | I think I'll stop there and ask if there is anything you
               | disagree with so far.
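                | 
                | (To make claim 2 concrete, here's a toy contrast between
                | an explicit kinematics calculation and the kind of
                | closed-loop heuristic mentioned upthread - a sketch, not a
                | claim about what brains actually compute:)
                | 
                |     import math
                | 
                |     # Explicit physics: where a ball launched at speed v
                |     # and angle theta lands (flat ground, no air drag).
                |     def landing_distance(v, theta, g=9.81):
                |         return v * v * math.sin(2 * theta) / g
                | 
                |     # Heuristic control: move so the ball's angle of
                |     # elevation stays roughly constant (the "gaze
                |     # heuristic"). No equations of motion, yet the fielder
                |     # ends up where the ball lands - behaviour that maps
                |     # onto the solution above without computing it.
                |     def fielder_step(angle_now, angle_before, speed=1.0):
                |         if angle_now > angle_before:
                |             return -speed   # angle rising: back up
                |         if angle_now < angle_before:
                |             return +speed   # angle falling: run in
                |         return 0.0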
        
               | avmich wrote:
                | It's great to read a conversation between towering HN
                | experts in the field.
               | 
               | Lisper, as I understand this part -
               | 
               | > In order to have kinematic skills, something in your
               | brain must be doing something that can be equated by some
               | mapping to solving kinematic equations
               | 
               | you're talking about an equivalent of YeGoblynQueenne's
               | 
               | > that humans ... do not find solutions to kinematic
               | equations, but instead use simple heuristics that exploit
               | our senses and body configuration, like placing their
               | hands in front of their eyes so that they line up with
               | the ball
               | 
                | So to me the question is, is it correct? Can "mapping to
                | solve kinematic equations" be the same as "simple
                | heuristic... like placing hands in front of eyes"?
               | 
               | Physically this equivalence seems at least plausible.
               | 
               | Now, about
               | 
               | > neurons operate according to laws having to do with
               | electrical impulses
               | 
                | - can't we have that kinematic-equation solving, or, in
                | other words, the application of simple heuristics, emerge
                | as a trained combination of such neuronal activity?
        
               | mistermann wrote:
               | > That is a very big assumption -that brains have
               | conscious and subconscious levels that are good and bad
               | at different things- that needs to be itself proved,
               | _before it can be used to support any other line of
               | inquiry_.
               | 
               | Does this assumption itself need to be proven?
               | 
               | Besides, it's not true: you can simply define it as an
               | assumption within a thought experiment and proceed
               | merrily along, or you can just not bother to consider
               | whether one's premises are true in the first place, and
               | proceed merrily along.
               | 
               | The second option tends to be more popular in my
               | experience, perhaps because it is so much easier, and
               | perhaps for some other reasons also.
        
       | admsmz wrote:
       | For a moment there I thought it was talking about this high
       | school project, https://www.eurisko.us/
        
         | ngvrnd wrote:
         | No, the high school project is talking about this.
        
         | monero-xmr wrote:
         | And the classic X-Files episode "Ghost in the Machine"
         | https://en.wikipedia.org/wiki/Ghost_in_the_Machine_(The_X-Fi...
        
       | dang wrote:
       | Related. Others?
       | 
       |  _Doug Lenat 's sources for AM (and EURISKO+Traveller?) found in
       | public archives_ - https://news.ycombinator.com/item?id=38413615
       | - Nov 2023 (9 comments)
       | 
       |  _Eurisko Automated Discovery System_ -
       | https://news.ycombinator.com/item?id=37355133 - Sept 2023 (1
       | comment)
       | 
       |  _Why AM and Eurisko Appear to Work (1983) [pdf]_ -
       | https://news.ycombinator.com/item?id=28343118 - Aug 2021 (17
       | comments)
       | 
       |  _Early AI: "Eurisko, the Computer with a Mind of Its Own"
       | (1984)_ - https://news.ycombinator.com/item?id=27298167 - May
       | 2021 (2 comments)
       | 
       |  _Some documents on AM and EURISKO_ -
       | https://news.ycombinator.com/item?id=18443607 - Nov 2018 (10
       | comments)
       | 
       |  _Why AM and Eurisko Appear to Work (1983) [pdf]_ -
       | https://news.ycombinator.com/item?id=9750349 - June 2015 (5
       | comments)
       | 
       |  _Why AM and Eurisko Appear to Work (1984) [pdf]_ -
       | https://news.ycombinator.com/item?id=8219681 - Aug 2014 (2
       | comments)
       | 
       |  _Eurisko, The Computer With A Mind Of Its Own_ -
       | https://news.ycombinator.com/item?id=2111826 - Jan 2011 (9
       | comments)
       | 
       |  _Let 's reimplement Eurisko_ -
       | https://news.ycombinator.com/item?id=656380 - June 2009 (25
       | comments)
       | 
       |  _Eurisko, The Computer With A Mind Of Its Own_ -
       | https://news.ycombinator.com/item?id=396796 - Dec 2008 (13
       | comments)
        
       | slavboj wrote:
        | EURISKO is basically a series of genetic algorithms over Lisp
        | code - the homoiconic nature of Lisp making it effectively a
        | meta-optimizer. Amongst its many problems was that the solution
        | space, even for things like "be interesting and true", was way
        | too large.
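        | 
        | A crude sketch of that flavour of search - mutate small expression
        | trees and keep whatever a scoring function calls "interesting"
        | (Python standing in for the Lisp, with a made-up interestingness
        | score; not EURISKO's actual machinery):
        | 
        |     import random
        | 
        |     OPS = ["+", "*", "-"]
        | 
        |     def random_expr(depth=2):
        |         """Grow a tiny expression tree over the variable x."""
        |         if depth <= 0 or random.random() < 0.3:
        |             return random.choice(["x", random.randint(0, 9)])
        |         return [random.choice(OPS), random_expr(depth - 1),
        |                 random_expr(depth - 1)]
        | 
        |     def evaluate(e, x):
        |         if e == "x": return x
        |         if isinstance(e, int): return e
        |         op, a, b = e
        |         a, b = evaluate(a, x), evaluate(b, x)
        |         return a + b if op == "+" else a * b if op == "*" else a - b
        | 
        |     def mutate(e):
        |         """Replace a random subtree - code rewriting code."""
        |         if not isinstance(e, list) or random.random() < 0.3:
        |             return random_expr()
        |         e = list(e)
        |         i = random.randint(1, 2)
        |         e[i] = mutate(e[i])
        |         return e
        | 
        |     def interestingness(e):
        |         # Made-up score: big value at x=5, small expression.
        |         return evaluate(e, 5) / (1 + len(str(e)))
        | 
        |     pop = [random_expr() for _ in range(50)]
        |     for gen in range(20):
        |         pop.sort(key=interestingness, reverse=True)
        |         pop = pop[:10] + [mutate(random.choice(pop[:10]))
        |                           for _ in range(40)]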
        
       | nxobject wrote:
        | Random fun fact I didn't anticipate learning: Eurisko ran on Altos
       | as well. Talk about a resource-constrained environment...
        
       ___________________________________________________________________
       (page generated 2024-04-23 23:00 UTC)