[HN Gopher] EURISKO Lives
___________________________________________________________________
EURISKO Lives
Author : wodow
Score : 112 points
Date : 2024-04-23 03:50 UTC (19 hours ago)
(HTM) web link (blog.funcall.org)
(TXT) w3m dump (blog.funcall.org)
| whartung wrote:
| The confluence of happenstance that had to occur to make this a
| reality is pretty amazing to witness.
|
| Unfortunately it starts with the passing of Douglas Lenat. But
| that enabled Stanford to open up their 40-year-old archive, which
| they still had, of Lenat's work.
|
| Somehow, someway, someone not only stumbled upon EURISKO, but
| also knew what it was: one of the most notorious AI research
| projects of the age, one that actually broke out of Stanford's
| research labs and into the public eye, with impactful results.
| Granted, for arguably small values of "public" and "impactful",
| but for the small community it affected, it made a big splash.
|
| Lenat used EURISKO to find a very unconventional winning
| configuration and went on to win a national gaming tournament.
| Twice.
|
| In that community, it was a big deal. The publisher changed the
| rules because of it, but Lenat returned victorious again the next
| year. After a discussion with the game and tournament sponsors,
| he never came back.
|
| Apparently EURISKO has quite a reputation in the symbolic AI
| world, but even there it was held close.
|
| But now it has been made available. Not only made available, but
| made operational. EURISKO is written in an obsolete Lisp dialect,
| Interlisp. But, coincidentally, we today have machine simulators
| that can run versions of that Lisp on long-lost, 40-year-old
| machines.
|
| And someone was able to port it. And it seems to run.
|
| The thought of the tendrils through time that had to twist their
| way for us to get here leaves me, at least, awestruck. So much
| opportunity for the wrong butterfly to have been stepped on to
| prevent this from happening.
|
| But it didn't, and here we are. Great job by the spelunkers who
| dug this up.
| jsnell wrote:
| Enough of the Traveller tournament story is dodgy and
| inconsistent that it's very hard to say what actually happened
| beyond Lenat winning the tournament twice in a row with some
| kind of computer assistance.
|
| Basically, with the Traveller tournament Lenat appears to have
| stumbled onto a story that caught the public's imagination, and
| then milked it for all he could to give his project publicity
| and to make it appear more successful than it
| actually was. And if that required embellishing the story or
| just making shit up, well, no harm no foul.
|
| Even when something is technically true, it often turns out
| that it's being told in a misleading way. For example, you say
| that "the publisher changed the ruleset". That was the entire
| gimmick of the Traveller TCS tournament rules! The printed
| rulebook had a preset progression of tournament rules for each
| year.
|
| I wrote a bit more about this a few years ago with some of the
| other details: https://news.ycombinator.com/item?id=28344379
| peheje wrote:
| What is it?
| emmelaich wrote:
| _Eurisko (Gr., I discover) is a discovery system written by
| Douglas Lenat in RLL-1, a representation language itself
| written in the Lisp programming language. A sequel to Automated
| Mathematician, it consists of heuristics, i.e. rules of thumb,
| including heuristics describing how to use and change its own
| heuristics_
|
| - https://en.wikipedia.org/wiki/Eurisko
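|
| A rough sketch of that last idea, in Python rather than Lisp
| (purely illustrative; the names and structure here are made up,
| not Lenat's RLL-1):
|
|     # Heuristics are plain data: a condition, an action, and a
|     # "worth" score. Because they are data, a meta-heuristic can
|     # inspect and rewrite other heuristics.
|     class Heuristic:
|         def __init__(self, name, condition, action, worth=500):
|             self.name = name
|             self.condition = condition  # concept -> bool
|             self.action = action        # concept -> bool ("interesting?")
|             self.worth = worth
|
|     def specialize(h, heuristics):
|         # Meta-heuristic: clone a well-performing heuristic with a
|         # stricter condition and add it back into the pool.
|         if h.worth > 600 and not h.name.endswith("-special"):
|             heuristics.append(Heuristic(
|                 h.name + "-special",
|                 lambda c: h.condition(c) and c.get("examples", 0) > 3,
|                 h.action,
|                 worth=h.worth // 2))
|
|     def discovery_loop(heuristics, concepts, steps=50):
|         for _ in range(steps):
|             h = max(heuristics, key=lambda x: x.worth)  # most promising
|             for c in concepts:
|                 if h.condition(c):
|                     h.worth += 10 if h.action(c) else -25  # credit
|             specialize(h, heuristics)  # heuristics modifying heuristics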
| KineticLensman wrote:
| It got a lot of kudos for winning a multi-player naval
| wargame by building a bizarre but successful fleet that
| exploited all the loopholes and quirks in the rules.
| actionfromafar wrote:
| Didn't it build a swarm of tiny boats? That loophole seems
| to be currently exploited in the real world, too.
| KineticLensman wrote:
| IIRC it had at least one small (?) purely defensive boat
| that couldn't be destroyed by typical weapons so its
| parent fleet couldn't be defeated. It wasn't like a
| modern drone swarm.
| PaulHoule wrote:
| It makes me think of the battles in Doc Smith's _Lensman_
| series where the Galactic Patrol would develop a game-
| breaking fleet formation to use against Boskone in every
| major naval battle.
| KineticLensman wrote:
| Ah yes, probably need to re-read them. I remember how
| virtually every book introduces a new super weapon that
| becomes part of the standard arsenal in the next, all the
| way up to entire 'negative matter' planets that are fired
| out of subspace (can't recall the in-universe name) at
| the planets hosting your opponent's base.
| PaulHoule wrote:
| These are all on the Canadian Gutenberg.
| pfdietz wrote:
| Traveller Trillion Credit Squadron
|
| Traveller was (and is) a space-based RPG, although the
| original publisher is long out of business.
|
| https://en.wikipedia.org/wiki/Traveller_(role-playing_game)
| pfdietz wrote:
| TCS itself: https://www.mongoosepublishing.com/products/a
| dventure-5-tril...
| boffinAudio wrote:
| Can anyone give a clear example of how this can be used
| productively? Its description doesn't help much.
|
| What can one _do_ with EURISKO? The fact of its recovery
| after its author's passing is interesting, in and of itself -
| but why is EURISKO, specifically, worth the effort of
| understanding?
| pamoroso wrote:
| This post also explains what EURISKO does by putting it
| into the context of the later system Doug Lenat worked on,
| Cyc https://outsiderart.substack.com/p/cyc-historys-
| forgotten-ai...
| isaacfrond wrote:
| Excellent question. I have no clue either.
| downvotetruth wrote:
| dupe https://news.ycombinator.com/item?id=40095121
|
| 4 months ago: https://news.ycombinator.com/item?id=38413615
| varjag wrote:
| I shall belatedly correct: the heuristic I point at after IsA is
| not in fact not-IsA. Also, the system runs out of stack space,
| not heap space.
| thom wrote:
| Up until about GPT 2, EURISKO was arguably the most interesting
| achievement in AI. Back in the day on the SL4 and singularitarian
| mailing lists, it was spoken of in reverent tones, and I'm sure I
| remember a much younger Eliezer Yudkowsky cautioning that Doug
| Lenat should have perceived a non-zero chance of hard takeoff at
| the moment of its birth. I suspect its achievements were slightly
| overblown and heavily guided by a human hand, but it's still
| fascinating and definitely worthy of study. Genetic programming
| hasn't yielded many interesting results since, and the
| unreasonable effectiveness of differentiable programming and
| backpropagation has sucked up much of the oxygen in the room. But
| not everything is differentiable, the combination of the two
| still seems worth investigating, and EURISKO goes to show the
| power of heuristic approaches to some problems.
| cabalamat wrote:
| > Up until about GPT 2, EURISKO was arguably the most
| interesting achievement in AI.
|
| I agree.
|
| > I suspect its achievements were slightly overblown and
| heavily guided by a human hand
|
| So do I. We'll find out how much of its performance was real,
| and how much bullshit.
|
| > the unreasonably effectiveness of differentiable programming
| and backpropagation has sucked up much of the oxygen in the
| room
|
| The Bitter Lesson --
| http://www.incompleteideas.net/IncIdeas/BitterLesson.html
| Animats wrote:
| Not really. Read [1], which references "Why AM and Eurisko
| appear to work". There's a reason that line of development did
| not continue.
|
| [1] https://news.ycombinator.com/item?id=28343118
| radomir_cernoch wrote:
| > Up until about GPT 2, EURISKO was arguably the most
| interesting achievement in AI.
|
| I'm really baffled by such a statement and genuinely curious.
|
| How come studying GOFAI as an undergraduate and graduate student
| at several European universities, doing a PhD, and working in
| the field for several years _never_ exposed me to EURISKO until
| last week (thanks to HN)?
|
| I had heard about Cyc and about many formalisms and algorithms
| related to EURISKO, but never its name.
|
| Is EURISKO famous in the US only?
| radomir_cernoch wrote:
| For that reason, a comparison between GPT 2 and EURISKO seems
| funny to me.
|
| I discussed ChatGPT with my yoga teacher recently, but I bet
| not even my IT colleagues would have a clue about EURISKO.
| :-)
| Phiwise_ wrote:
| So? There's a real possibility DART has still saved its
| customers more money over its lifetime than GPT has, and
| odds are basically 100% that your yoga teacher and IT
| colleagues haven't heard a thing about it either. The
| general public has all sorts of wrong impressions and unknown
| unknowns, so I don't see why it should ever be used as a
| technology-industry benchmark by anyone not working in the UI
| department of a smartphone vendor.
| rjsw wrote:
| > Is EURISKO famous in US only?
|
| It was featured in a BBC radio series on AI made by Colin
| Blakemore [1] around 1980, the papers on AM and EURISKO were
| in the library of the UK university that I attended.
|
| [1] https://en.wikipedia.org/wiki/Colin_Blakemore#Public_enga
| gem...
| craigus wrote:
| "... I'm sure I remember a much younger Eliezer Yudkowsky
| cautioning that Doug Lenat should have perceived a non-zero
| chance of hard takeoff at the moment of its birth."
|
| https://www.lesswrong.com/posts/rJLviHqJMTy8WQkow/recursion-...
| cabalamat wrote:
| Also, in 2009 someone suggested re-implementing Eurisko[1],
| and Yudkowsky cautioned against it:
|
| > This is a road that does not lead to Friendly AI, only to
| AGI. I doubt this has anything to do with Lenat's motives -
| but I'm glad the source code isn't published and I don't
| think you'd be doing a service to the human species by trying
| to reimplement it.
|
| To my mind -- and maybe this is just the benefit of hindsight
| -- this seems way too overcautious on Yudkowsky's part.
|
| [1]: https://www.lesswrong.com/posts/t47TeAbBYxYgqDGQT/let-s-
| reim...
| randallsquared wrote:
| Machinery can be a lot simpler than biology. Birds are
| incredibly complex systems: wing structure, musculature,
| feathers, etc. An airplane can be a vaguely wing-shaped
| piece of metal and a pulse jet. It doesn't seem super
| implausible that there is some algorithm that is to human
| consciousness what a pulse jet with wings is to a bird.
| Maybe LLMs are that, but maybe they're far more than is
| really needed because we don't yet know what we are doing.
|
| I would bet against it being possible to implement
| consciousness on a PDP, but I wouldn't be very confident
| about it.
| api wrote:
| > a much younger Eliezer Yudkowsky cautioning that Doug Lenat
| should have perceived a non-zero chance of hard takeoff at the
| moment of its birth
|
| Why is Yudkowsky taken seriously? This stuff is comparable to
| the "LHC micro black holes will destroy Earth" hysteria.
|
| There are actual concerns around AI like deep fakes, a deluge
| of un-filterable spam, mass manipulation via industrial scale
| propaganda, mass unemployment created by widespread automation
| leading to civil unrest, opaque AIs making judgements that
| can't be evaluated properly, AI as a means of mass
| appropriation of work and copyright violation, concentration of
| power in large AI companies, etc. The crackpot "hard takeoff"
| hysteria only distracts from reasonable discourse about these
| risks and how to mitigate them.
| TeMPOraL wrote:
| > _Why is Yudkowsky taken seriously?_
|     Trivialities   Annoyances   Immediate harm      X-Risk
|     |----------------------------------------------------|
|     \---------stuff you mention----------/
|                         \---stuff Eliezer wrote about----/
|
| > _The crackpot "hard takeoff" hysteria only distracts from
| reasonable discourse about these risks and how to mitigate
| them._
|
| IDK, I feel endless hand-wringing about copyright and
| deepfakes distracts from risks of actual, significant harm at
| scale, some of which you also mentioned.
| thom wrote:
| Perhaps we can disagree on the shape of the curve, but it
| seems likely that ever more capable AI will enable ever more
| serious harms. Absolutely true that we should counter those
| harms in the present and not fixate on a theoretical future,
| but the medicine is much the same either way.
| adw wrote:
| > Why is Yudkowsky taken seriously?
|
| People like religion, particularly if it doesn't affect how
| they live their life _today_ too much. You get all of the
| emotional benefits of feeling like you're doing something
| virtuous without the effort of actually performing good
| works.
| rvba wrote:
| > "LHC micro black holes will destroy Earth" hysteria.
|
| I will be heavily downvoted for this, but here is how I
| remember it:
|
| 1) LHC was used to study black holes and prove things like
| Hawking radiation
|
| 2) LHC was supposed to be safe due to Hawking radiation (which
| was only an unproven theory at the time)
|
| So the unpopular question: what if Hawking radiation didn't
| actually exist? Wouldn't there be a risk of us dying? A small
| risk, but still some risk? (Especially as the potential micro
| black hole would have the same velocity as Earth, so it
| wouldn't fly away somewhere into space.)
|
| On a side note: how would EURISKO evaluate this topic?
|
| Since I read about this secretive Cyc (why can you email asking
| for it, but the source isn't hosted anywhere?): couldn't any
| current statistics-based AI be used to feed this Cyc program /
| database with information? Take a dictionary and ask ChatGPT
| to fill it with information for each word.
| api wrote:
| The fundamental reason that hysteria was silly is that
| Earth is bombarded by cosmic rays that are far stronger
| than anything done in the LHC. The reason we built the LHC
| is so we can do observable repeatable experiments at high
| energies, not to reach energies never reached on Earth
| before.
|
| The AI hysteria I'm talking about here is the "foom"
| hysteria, the idea that a sufficiently powerful model will
| start self-improving without bound and become some kind of
| AI super-god. That's about as wild as the idea that the LHC
| will make a black hole that implodes the Earth. There are
| fundamental reasons to believe it's impossible, such as the
| question of "where would the information come from to drive
| that runaway intelligence explosion?"
|
| There are legitimate risks with AI, but not because AI is
| somehow special and magical. All technologies have risks.
| If you make a sharper stick, someone will stab someone with
| it. Someday we may make a stick so sharp it stabs the
| entire world (cue 50s sci-fi theremin music).
|
| Edit: for example... I would argue that the Internet itself
| has X-risks. The Internet creates an environment that
| incentivizes an arms race for attention grabbing, and the
| most effective strategies usually rely on triggering
| negative emotions and increasing division. This could run
| away to the point that it drives, say, civilizational
| collapse or a global thermonuclear war. Does this mean it
| would have been right to ban the Internet or require strict
| licensing to place any new system online?
| lisper wrote:
| > the combination of the two still seems worth investigating
|
| This.
|
| Back in the late 1980's and early 90's the debate-du-jour was
| between deliberative and reactive control systems for robots. I
| got my Ph.D. for simply saying that the entire debate was based
| on the false premise that it had to be one or the other, that
| each approach had its strengths and weaknesses, and that if you
| just put the two together the whole would be greater than the
| sum of its parts. (Well, it was a little more than that. I had
| to actually show that it worked, which was more work than
| simply advancing the hypothesis, but in retrospect it seems
| kinda obvious, doesn't it?)
|
| If I were still in the game today, combining generative-AI and
| old-school symbolic reasoning (which has also advanced a lot in
| 30 years) would be the first thing I would focus my attention
| (!) on.
| adw wrote:
| People have advanced that argument a lot, and it's often
| worked for a short while; then the statistical models get
| better.
|
| Chess was a game for humans.
|
| It was very briefly a game for humans and machines (Kasparov
| had a go at getting "Advanced Chess" off the ground as a
| competitive sport), but soon enough having a human in the
| team made the program worse.
|
| But at least the evaluation functions were designed by
| humans, right? That lasted a remarkably long time; first
| Stockfish became the strongest engine in the world by using
| distributed hyperparameter search to tweak its piece-square
| tables, then AlphaZero came along and used a policy network +
| MCTS instead of alpha-beta search, then (with an assist from
| the Shogi community) Stockfish struck back with a completely
| learned evaluation function via NNUE.
|
| So the last frontier of human expertise in chess is search
| heuristics, and that's going to fall too:
| https://arxiv.org/abs/2402.04494.
|
| The common theme in all of this is that the stuff we used
| before is, fundamentally, a set of hacks to get around _not
| having enough compute_, hacks which make the system worse once
| you no longer have to make those tradeoffs around inductive
| biases. Empirical evidence suggests that raw scaling has a
| long way to run yet.
| YeGoblynQueenne wrote:
| That's the "bitter lesson", right? Which is really a sour
| lesson- as in sour grapes. See, Rich Sutton's point with
| his Bitter Lesson is that encoding expert knowledge only
| improves performance temporarily, which is eventually
| surpassed by more data and compute.
|
| There are only two problems with this: One, statistical
| machine learning systems have an extremely limited ability
| to encode expert knowledge. The language of continuous
| functions is alien to most humans and it's very difficult
| to encode one's intuitive, common sense knowledge into a
| system using that language [1]. That's what I mean when I
| say "sour grapes". Statistical machine learning folks can't
| use expert knowledge very well, so they pretend it's not
| needed.
|
| Two, all the loud successes of statistical machine learning
| in the last couple of decades are closely tied to minutely
| specialised neural net architectures: CNNs for image
| classification, LSTMs for translation, Transformers for
| language, Diffusion models and GANs for image generation.
| If that's not encoding knowledge of a domain, what is?
|
| Three, because of course three, despite point number two,
| performance keeps increasing only as data and compute
| increases. That's because the minutely specialised
| architectures in point number two are inefficient as all
| hell; the result of not having a good way to encode expert
| knowledge. Statistical machine learning folk make a virtue
| out of necessity and pretend that only being able to
| increase performance by increasing resources is some kind
| of achievement, whereas it's exactly the opposite: it is a
| clear demonstration that the capabilities of systems are
| not improving [2]. If capabilities were improving, we
| should see the number of examples required to train a
| state-of-the-art system either staying the same, or going
| down. Well, it ain't.
|
| Of course the neural net [community] will complain that
| their systems have reached heights never before seen in
| classical AI, but that's an argument that can only be
| sustained by the ignorance of the continued progress in all
| the classical AI subjects such as planning and scheduling,
| SAT solving, verification, automated theorem proving and so
| on.
|
| For example, and since planning is high on my priorities
| these days, see this video where the latest achievements in
| planning are discussed (from 2017).
|
| https://youtu.be/g3lc8BxTPiU?si=LjoFITSI5sfRFjZI
|
| See particularly around this point where he starts talking
| about the Rollout IW(1) symbolic planning algorithm that
| plays Atari from screen pixels with performance comparable
| to Deep-RL; except it does so _online_ (i.e. no training,
| just reasoning on the fly):
|
| https://youtu.be/g3lc8BxTPiU?si=33XSM6yK9hOlZJnf&t=1387
|
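| For readers who haven't met it, IW(1) is a blind breadth-first
| search that prunes every state failing to make at least one
| atomic feature true for the first time. A minimal sketch in
| Python (my own simplification, assuming generic successor and
| feature functions, not the Atari setup from the talk):
|
|     from collections import deque
|
|     def iw1(start, successors, features, is_goal):
|         # successors(s) -> iterable of (action, next_state)
|         # features(s)   -> iterable of hashable atoms, e.g. ('x', 3)
|         seen_atoms = set(features(start))
|         queue = deque([(start, [])])
|         while queue:
|             state, plan = queue.popleft()
|             if is_goal(state):
|                 return plan
|             for action, nxt in successors(state):
|                 new_atoms = set(features(nxt)) - seen_atoms
|                 if not new_atoms:        # novelty > 1: prune
|                     continue
|                 seen_atoms |= new_atoms  # mark these atoms as seen
|                 queue.append((nxt, plan + [action]))
|         return None
|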
| Bitter lesson my sweet little ass.
|
| ____________
|
| [1] Gotta find where this paper was but none other than
| Vladimir Vapnik basically demonstrated this by trying the
| maddest experiment I've ever seen in machine learning:
| using poetry to improve a vision classifier. It didn't
| work. He's spent the last 20 years trying to find a good
| way to encode human knowledge into continuous functions. It
| doesn't work.
|
| [2] In particular their capability for inductive
| generalisation which remains absolutely crap.
| og_kalu wrote:
| >Two, all the loud successes of statistical machine
| learning in the last couple of decades are closely tied
| to minutely specialised neural net architectures: CNNs
| for image classification, LSTMs for translation,
| Transformers for vision, Difussion models and Ganns for
| image generation. If that's not encoding knowledge of a
| domain, what is?
|
| Transformers, Diffusion for Vision, Image generation are
| really odd examples here. None of those architectures or
| training processes were designed with vision in mind, lol. It
| was what, 3 years after the 2017 attention paper before the
| famous ViT paper? CNNs have lost a lot of favor to ViTs, and
| LSTMs are not the best-performing translators today.
|
| The bitter lesson is that less encoding of "expert"
| knowledge results in better performance and this has
| absolutely held up. The "encoding of knowledge" you call
| these architectures is nowhere near that of the GOFAI
| kind and even more than that, less biased NN
| architectures seem to be winning out.
|
| >That's because the minutely specialised architectures in
| point number two are inefficient as all hell; the result
| of not having a good way to encode expert knowledge.
|
| Inefficient is a whole lot better than can't even play
| the game, the story of GOFAI for the last few decades.
|
| >If capabilities were improving, we should see the number
| of examples required to train a state-of-the-art system
| either staying the same, or going down. Well, they ain't.
|
| The capabilities of models are certainly increasing. Even
| your example is blatantly wrong. Do you realize how much
| more data and compute it would take to train a Vanilla
| RNN to say GPT-3 level performance?
| YeGoblynQueenne wrote:
| >> Inefficient is a whole lot better than can't even play
| the game, the story of GOFAI for the last few decades.
|
| See e.g. my link above where GOFAI plays the game (Atari)
| very well indeed.
|
| Also see Watson winning Jeopardy (a hybrid system, but
| mainly GOFAI - using frames and Prolog for knowledge
| extraction, encoding and retrieval).
|
| And Deep Blue beating Kasparov. And MCTS still the SOTA
| search algo in Go etc.
|
| And EURISKO playing Traveller, as above.
|
| And Pluribus playing Poker with expert game-playing
| knowledge.
|
| And the recent neuro-symbolic DeepMind thingy that solves
| geometry problems from the maths olympiad.
|
| etc. etc. [Gonna stop editing and adding more as they
| come to my mind here.]
|
| And that's just playing games. As I say in my comment
| above planning and scheduling, SAT, constraints,
| verification, theorem proving- those are still dominated
| by classical systems and neural nets suck at them. Ask
| Yann LeCun: "Machine learning sucks". He means it sucks in
| all the things that classical AI does best and he means
| he wants to do them with neural nets, and of course he'll
| fail.
| og_kalu wrote:
| That was a figure of speech. I didn't literally mean
| games (not that GOFAI performs better than NNs in those
| games anyway). I simply went off your own examples -
| Vision, Image generation, Translation etc.
|
| >As I say in my comment above planning and scheduling,
| SAT, constraints, verification, theorem proving- those
| are still dominated by classical systems
|
| You can use NNs for all these things. It wouldn't make a lot
| of sense, because GOFAI would be perfect and NNs would be
| inefficient, but you certainly could, which is again more than
| I can say for GOFAI in the domains you listed.
| YeGoblynQueenne wrote:
| I don't understand your comment. Clarify.
|
| As it is, your comment seems to tell me that neural nets
| are good at neural net things and GOFAI is good at GOFAI
| things, which is obvious, and is what I'm saying: neural
| nets can make only very limited use of expert knowledge
| and so suck in all domains where domain knowledge is
| abundant and abundantly useful, which are the same
| domains where GOFAI dominates. GOFAI can make very good
| use of expert knowledge but is traditionally not as good
| in domains where only tacit knowledge is available,
| because we don't understand the domain well enough yet,
| like in anything to do with pattern recognition, which is
| the same domains where neural nets dominate. If explicit,
| expert knowledge was available for those domains, then
| GOFAI would dominate, and neural nets would fall behind,
| completely contrary to what Sutton thinks.
|
| So, the bitter lesson is only bitter for those who are
| not interested in what classical AI systems can do best.
| For those of us who are, the lesson is sweet indeed:
| we're making progress, algorithmic progress, progress in
| understanding, scientific progress, and don't need to
| burn through thousands of credits to train on server farms
| to do anything of note. That's even a running joke in my
| team: hey, do you need any server time? Nah, I'll run the
| experiment on my laptop over lunch. And then beat the RL
| algo (PPO) that needs three days training on GPUs. To
| solve mazes badly.
| og_kalu wrote:
| NNs can do the things GOFAI is good at a whole lot better
| than GOFAI can do the things NNs are good at.
| YeGoblynQueenne wrote:
| That's wishful thinking not supported by empirical
| results.
| YeGoblynQueenne wrote:
| Addendum:
|
| >> Do you realize how much more data and compute it would
| take to train a Vanilla RNN to say GPT-3 level
| performance?
|
| Oh, good point. And what would GPT-3 do with the typical
| amount of data used to train an LSTM? Rhetorical.
| adw wrote:
| > And MCTS still the SOTA search algo in Go etc
|
| It's often forgotten that Rich Sutton said the two things
| which work are learning (the AlphaGo/Leela Zero policy
| network) and search (MCTS). (I think the most interesting
| research in ML is around the circumstances in which large
| models wind up performing implicit search.)
| adw wrote:
| Yeah, all of those architectures are _themselves_ hacks
| to get around having insufficient compute! They
| absolutely were encoding inductive biases into the
| network to get around not being able to train enough, and
| transformers (handwaving hard enough to levitate, the
| currently-trainable model family with the least inductive
| bias) have eaten the world in all domains.
|
| This is evidence _for_ the Bitter Lesson, not against it.
| YeGoblynQueenne wrote:
| They haven't (eaten the world etc). They just happen to
| be the models that trend hard right now. I bet if you
| could compare like for like you'd be able to see _some_
| improvement in performance from Transformers, but that'd be
| extremely hard to separate from the expected
| improvement from the constantly increasing amounts of
| data and compute. For example, you could, today, train a
| much bigger and deeper Multi-Layered Perceptron than you
| could thirty years ago, but nobody is trying because
| that's so 1990's, and in any case they have the data and
| compute to train much bigger, much more inefficient
| (contrary to what you say if I got that right)
| architectures.
|
| Wait a few years and the Next Big Thing in AI will come
| along, hot on the heels of the next generation of GPUs,
| or tensor units or whatever the hardware industry can
| cook up to sell shovels for the gold rush. By then,
| Transformers will have hit the plateau of diminishing
| returns, there'll be gold in them there other hills and
| nobody would talk of LLMs anymore because that's so
| 2020s. We've been there so many times before.
| adw wrote:
| > much more inefficient
|
| The tricky part here is that "efficiency" is not a single
| dimension! Transformers are much more "efficient" in one
| sense, in that they appear to be able to absorb much more
| data before they saturate; they're in general less
| computationally efficient in that you can't exploit
| symmetries as hard, for example, at implementation time.
|
| Let's talk about that in terms of a concrete example: the
| big inductive bias of CNNs for vision problems is that
| CNNs essentially presuppose that the model should be
| translation-invariant. This works great -- speeds up
| training and makes it more stable - until it doesn't and
| that inductive bias starts limiting your performance,
| which is in the large-data limit.
|
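| A tiny numpy sketch of that inductive bias (my own example, not
| from the thread above): a convolution's shared weights make its
| output shift along with a shifted input, while a generic dense
| layer has no such structure.
|
|     import numpy as np
|
|     def conv1d(signal, kernel):
|         k = len(kernel)
|         return np.array([signal[i:i + k] @ kernel
|                          for i in range(len(signal) - k + 1)])
|
|     rng = np.random.default_rng(0)
|     x, kernel = rng.normal(size=20), rng.normal(size=3)
|     shifted = np.roll(x, 4)
|
|     # Convolving a shifted signal == shifting the convolved signal
|     # (away from the wrapped edge) -- the prior CNNs get for free.
|     print(np.allclose(conv1d(shifted, kernel)[4:],
|                       conv1d(x, kernel)[:-4]))          # True
|
|     W = rng.normal(size=(20, 20))                        # dense layer
|     print(np.allclose(np.roll(W @ x, 4), W @ shifted))   # False
|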
| Fully-connected NNs are more general than transformers,
| but they have _so many_ degrees of freedom that the
| numerical optimization problem is impractical. If someone
| figures out how to stabilize that training and make these
| implementable on current or future hardware, you're
| absolutely right that you'll see people use them. I don't
| think transformers are magic; you're entirely correct in
| saying that they're the current knee on the
| implementability/trainability curve, and that can easily
| shift given different unit economics.
|
| I think one of the fundamental disconnects here is that
| people who come at AI from the perspective of logic down
| think of things very differently to people like me who
| come at it from thermodynamics _up_.
|
| Modern machine learning is just "applications of maximum
| entropy", and to someone with a thermodynamics
| background, that's intuitively obvious (not necessarily
| correct! just obvious) -in a meaningful sense the
| _universe_ is a process of gradient descent, so "of
| course" the answer for some local domain models is
| maximum-entropy too. In that world view, the higher-order
| structure is _entirely emergent_. I'm, by training, a
| crystallographer, so the idea that you can get highly
| regular structure emerging from merciless application of
| a single principle is just baked into my worldview very
| deeply.
|
| Someone who comes at things from the perspective of
| mathematical logic is going to find that worldview very
| weird, I suspect.
| adw wrote:
| > They haven't (eaten the world etc).
|
| To clarify what I mean on this specific bit: the SOTA
| results in 2D and 3D vision, audio, translation, NLP, etc
| are all transformers. Past results do not necessarily
| predict future performance, and it would be absurd to
| claim that this is an immutable state of affairs, but it's
| certainly interesting that all of the domain-specific
| architectures have been flattened in a very short period
| of time.
| adw wrote:
| > The language of continuous functions is alien to most
| humans and it's very difficult to encode one's intuitive,
| common sense knowledge into a system using that language
|
| In other words: machine-learned models are octopus brains
| (https://www.scientificamerican.com/article/the-mind-of-
| an-oc...) and that creeps you out. Fair enough, it creeps
| me out too, and we should honour our emotions -- I'm no
| rationalist - but we should also be aware of the risks of
| confusing our emotional responses with reality.
| gwern wrote:
| Vapnik: https://www.cs.princeton.edu/courses/archive/spri
| ng13/cos511... https://engineering.columbia.edu/files/eng
| ineering/vapnik.pd...
| https://www.learningtheory.org/learning-has-just-started-
| an-... https://nautil.us/teaching-me-softly-234576/
|
| The main paper: https://gwern.net/doc/reinforcement-
| learning/exploration/act...
|
| It sounds kinda crazy (is there really that much far
| transfer?), but you know, I think it would work... He
| just needed to use LLMs instead:
| https://arxiv.org/abs/2309.10668#deepmind
| talldayo wrote:
| I find myself not wanting to agree with you, but deep down
| I think you're right.
|
| AI greatly reminds me of the Library of Babel thought
| experiment. If we can imagine a library with every book
| that can possibly be written in any language, would it
| contain all human knowledge lost in a sea of noise? Is
| there merit or value in creating a system that sifts
| through such a library to unearth hidden truths, or are we
| dooming ourselves to finding meaning in nothingness?
|
| In a certain sense, there's immense value to developing
| concepts and ideas through intuition and thought. In
| another sense, a rose by any other name smells just as
| sweet; if an AI creates a perpetual motion device before a
| human does, that's not nothing. I don't _expect_ AI to
| speed past human capability like some people do, but it's
| certainly displaced a lot of traditional computer-vision
| and text generation applications.
| lisper wrote:
| > then the statistical models get better
|
| Maybe. The statistical models are definitely better at
| natural language processing now, but they still fail on
| analytical tasks.
|
| Of course, human brains are statistical models, so there's
| an existence proof that a sufficiently large statistical
| model is, well, sufficient. But that doesn't mean that you
| couldn't do better with an intelligently designed co-
| processor. Even humans do better with a pocket calculator,
| or even a sheet of paper, than they do with their unaided
| brains.
| YeGoblynQueenne wrote:
| If human brains are statistical models, why are human
| brains so bad at statistics?
|
| Edit: btw, same for probabilistic inference, same for
| logical inference, and same for any other thing anyone's
| tried as the one true path to AI since the 1950's. Humans
| have consistently proven bad at everything computers are
| good at, and that tells us nothing about why humans are
| good at anything (if, indeed, we are). Let's not assume
| too much about brains until we find the blueprint, eh?
| lisper wrote:
| > why are human brains so bad at statistics?
|
| That depends on what you mean by being "bad at
| statistics." What brains do on a conscious level is very
| different than what they do at a neurobiological level.
| Brains are "bad at statistics" on the conscious level,
| but at the level of neurobiology that's all they do.
|
| As an analogy, consider a professional tennis or baseball
| player. At the neurobiological level those people are
| extremely good at finding solutions to kinematic
| equations, but that doesn't mean that they would ace a
| physics test.
| YeGoblynQueenne wrote:
| That is a very big assumption -that brains have conscious
| and subconscious levels that are good and bad at
| different things- that needs to be itself proved, before
| it can be used to support any other line of inquiry.
|
| I'm not well versed in the relevant literature at all but
| my understanding is that research in the area points in
| completely the opposite direction: that humans e.g.
| playing baseball do not find solutions to kinematic
| equations, but instead use simple heuristics that exploit
| our senses and body configuration, like placing their
| hands in front of their eyes so that they line up with
| the ball etc.
|
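| For what it's worth, a concrete sketch of that kind of shortcut
| (my guess at the classic "gaze heuristic" for fly balls; the
| numbers and names are invented): no kinematic equations, just
| nudge your position so the ball's apparent elevation angle stops
| changing.
|
|     import math
|
|     def gaze_step(fielder_x, ball_x, ball_height, prev_angle,
|                   step=0.5):
|         angle = math.atan2(ball_height,
|                            abs(ball_x - fielder_x) + 1e-9)
|         away = 1.0 if fielder_x > ball_x else -1.0
|         if angle > prev_angle:       # rising in view: it will carry
|             fielder_x += step * away    # back up
|         elif angle < prev_angle:     # sinking in view: it will drop short
|             fielder_x -= step * away    # move in
|         return fielder_x, angle      # repeat every "tick" of the eye
|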
| This makes a lot more sense, not only for humans playing
| tennis, but for animals surviving in the wild, finding
| sustenance and shelter, and mates, while avoiding
| becoming a meal. Consider the Portia spider [1], a
| spider-hunting spider, itself prey to other hunting
| spiders, with a brain consisting of a few tens of
| thousands of neurons and still perfectly capable not only
| of navigating complex environments in all three space
| dimensions but also making complex plans involving
| detours.
|
| Just think of how quickly a spider must be able to think
| that hunts, and is hunted by other spiders -some of the
| most deadly predators in the animal kingdom. There isn't a
| snowball's chance in hell that such an animal has the
| time to solve kinematic equations with a few KBs of
| neurons. Absolutely no chance at all.
|
| For that and many other stuff like that it looks very
| unlikely to me that human brains, or any brains, are like
| you say. In any case, that sounds positively Freudian and
| I don't mean that as an insult, but I so could.
|
| ______________
|
| [1] My favourite. No, I don't mean meal. I just love this
| paper; it's almost the best paper in autonomous robotics
| and planning that I've ever read:
|
| https://www.frontiersin.org/journals/psychology/articles/
| 10....
| lisper wrote:
| > That is a very big assumption -that brains have
| conscious and subconscious levels that are good and bad
| at different things- that needs to be itself proved,
| before it can be used to support any other line of
| inquiry.
|
| You can't be serious. Do you really doubt that hand-eye
| coordination and solving systems of kinematic equations
| on paper using math are disjoint skills? That one can be
| good at one without being good at the other? That there
| is in actual fact an inverse correlation between these
| skills? How do you account for the fact that even people
| who have never studied math or physics can learn to throw
| and catch a ball?
| YeGoblynQueenne wrote:
| ... because they don't need to use maths or physics?
|
| And yes, I'm serious. Can you please be less
| confrontational?
| lisper wrote:
| Sorry about that, I'm dealing with a troll on another
| thread so I'm on a bit of a hair trigger.
|
| I think we have a fundamental disconnect somewhere, so
| let's try to diagnose it. Where do you start to disagree
| in the following series of claims:
|
| 1. People can have kinematic skills, like throwing and
| catching balls, without having math or physics skills,
| like solving kinematic equations.
|
| 2. In order to have kinematic skills, _something_ in your
| brain must be doing something that can be equated by some
| mapping to solving kinematic equations, because the
| actions that your muscles perform when performing
| kinematic skills _are_ the solutions to kinematic
| equations, so your brain must be producing those (things
| that map to) solutions somehow.
|
| 3. As far as we can tell, brains don't operate
| symbolically at the neurobiological level. Individual
| neurons operate according to laws having to do with
| electrical impulses, synapse firings, neurotransmitters,
| etc. none of which have anything to do with kinematics.
|
| 4. People with kinematic skills generally have only
| limited insight into how they do what they do when they
| apply those skills. Being able to catch a ball doesn't by
| itself give you enough insight to be able to describe to
| someone how to build a machine that would catch a ball.
| But someone with math and physics and engineering skills
| but no kinematic skills (your streotypical geek) could
| plausibly build a machine that could catch a ball much
| better than they themselves could. But the workings of a
| machine built using knowledge of math would almost
| certainly operate in a very different manner than the
| brain of a human with kinematic skills.
|
| I think I'll stop there and ask if there is anything you
| disagree with so far.
| avmich wrote:
| It's great to read a conversation between towering HN experts in
| the field.
|
| Lisper, as I understand this part -
|
| > In order to have kinematic skills, something in your
| brain must be doing something that can be equated by some
| mapping to solving kinematic equations
|
| you're talking about an equivalent of YeGoblynQueenne's
|
| > that humans ... do not find solutions to kinematic
| equations, but instead use simple heuristics that exploit
| our senses and body configuration, like placing their
| hands in front of their eyes so that they line up with
| the ball
|
| So to me the question is, is it correct? Can "mapping to
| solve kinematic equation" be the same as "simple
| heuristic... like placing hands in front of eyes"?
|
| Physically this equivalence seems at least plausible.
|
| Now, about
|
| > neurons operate according to laws having to do with
| electrical impulses
|
| - can't we have that solving of kinematic equations, or, in
| other words, that applying of simple heuristics, arise as a
| trained combination of such neuronal activity?
| mistermann wrote:
| > That is a very big assumption -that brains have
| conscious and subconscious levels that are good and bad
| at different things- that needs to be itself proved,
| _before it can be used to support any other line of
| inquiry_.
|
| Does this assumption itself need to be proven?
|
| Besides, it's not true: you can simply define it as an
| assumption within a thought experiment and proceed
| merrily along, or you can just not bother to consider
| whether one's premises are true in the first place, and
| proceed merrily along.
|
| The second option tends to be more popular in my
| experience, perhaps because it is so much easier, and
| perhaps for some other reasons also.
| admsmz wrote:
| For a moment there I thought it was talking about this high
| school project, https://www.eurisko.us/
| ngvrnd wrote:
| No, the high school project is talking about this.
| monero-xmr wrote:
| And the classic X-Files episode "Ghost in the Machine"
| https://en.wikipedia.org/wiki/Ghost_in_the_Machine_(The_X-Fi...
| dang wrote:
| Related. Others?
|
| _Doug Lenat's sources for AM (and EURISKO+Traveller?) found in
| public archives_ - https://news.ycombinator.com/item?id=38413615
| - Nov 2023 (9 comments)
|
| _Eurisko Automated Discovery System_ -
| https://news.ycombinator.com/item?id=37355133 - Sept 2023 (1
| comment)
|
| _Why AM and Eurisko Appear to Work (1983) [pdf]_ -
| https://news.ycombinator.com/item?id=28343118 - Aug 2021 (17
| comments)
|
| _Early AI: "Eurisko, the Computer with a Mind of Its Own"
| (1984)_ - https://news.ycombinator.com/item?id=27298167 - May
| 2021 (2 comments)
|
| _Some documents on AM and EURISKO_ -
| https://news.ycombinator.com/item?id=18443607 - Nov 2018 (10
| comments)
|
| _Why AM and Eurisko Appear to Work (1983) [pdf]_ -
| https://news.ycombinator.com/item?id=9750349 - June 2015 (5
| comments)
|
| _Why AM and Eurisko Appear to Work (1984) [pdf]_ -
| https://news.ycombinator.com/item?id=8219681 - Aug 2014 (2
| comments)
|
| _Eurisko, The Computer With A Mind Of Its Own_ -
| https://news.ycombinator.com/item?id=2111826 - Jan 2011 (9
| comments)
|
| _Let's reimplement Eurisko_ -
| https://news.ycombinator.com/item?id=656380 - June 2009 (25
| comments)
|
| _Eurisko, The Computer With A Mind Of Its Own_ -
| https://news.ycombinator.com/item?id=396796 - Dec 2008 (13
| comments)
| slavboj wrote:
| EURISKO is basically a series of genetic algorithms over lisp
| code - the homoiconic nature of lisp making it effectively a
| meta-optimizer. Amongst many problems was that the solution
| space, even for things like "be interesting and true", was way
| too large.
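|
| A minimal illustration of that "code as data" trick, with Python
| tuples standing in for s-expressions (a toy mutate-and-select
| loop, not EURISKO's actual mechanism; the target function is
| arbitrary):
|
|     import random
|
|     OPS = {"+": lambda a, b: a + b,
|            "-": lambda a, b: a - b,
|            "*": lambda a, b: a * b}
|
|     def evaluate(expr, x):
|         if expr == "x":
|             return x
|         if isinstance(expr, (int, float)):
|             return expr
|         op, a, b = expr                      # expressions are data
|         return OPS[op](evaluate(a, x), evaluate(b, x))
|
|     def random_expr(depth=2):
|         if depth == 0 or random.random() < 0.3:
|             return random.choice(["x", random.randint(-3, 3)])
|         return (random.choice(list(OPS)),
|                 random_expr(depth - 1), random_expr(depth - 1))
|
|     def mutate(expr):
|         # Swap out a random subtree -- possible because code is data.
|         if not isinstance(expr, tuple) or random.random() < 0.3:
|             return random_expr()
|         op, a, b = expr
|         return ((op, mutate(a), b) if random.random() < 0.5
|                 else (op, a, mutate(b)))
|
|     def fitness(expr):                       # toy scoring criterion
|         return -sum(abs(evaluate(expr, x) - (x * x + 1))
|                     for x in range(-5, 6))
|
|     pop = [random_expr() for _ in range(50)]
|     for _ in range(100):
|         pop.sort(key=fitness, reverse=True)
|         pop = pop[:25] + [mutate(random.choice(pop[:25]))
|                           for _ in range(25)]
|     print(max(pop, key=fitness))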
| nxobject wrote:
| Random funfact I didn't anticipate learning: Eurisko ran on Altos
| as well. Talk about a resource-constrained environment...
___________________________________________________________________
(page generated 2024-04-23 23:00 UTC)