[HN Gopher] Claude Shannon Demonstrates Machine Learning (1952)
___________________________________________________________________
Claude Shannon Demonstrates Machine Learning (1952)
Author : jchallis
Score : 108 points
Date : 2021-01-26 17:56 UTC (5 hours ago)
(HTM) web link (techchannel.att.com)
(TXT) w3m dump (techchannel.att.com)
| dang wrote:
 | If curious, see also
|
| 2014 https://news.ycombinator.com/item?id=7758547
|
| I think there have been other threads too?
| donquichotte wrote:
| What a grave misunderstanding! What he is demonstrating is
| remembering, or memory.
|
| It appears that the state of the mouse is [X, Y, Heading], where
| X and Y are discrete positions in the available cells.
|
| Once the maze has been solved by "a rather involved strategy of
| trial and error", the solution is saved, and for any of the known
| states, the mouse "remembers" its way to the exit of the maze.
| coldtea wrote:
 | All learning amounts to remembering. A machine learning model
 | is basically remembered weights. Everything we've learned is
 | what we remember, solidified in our neurons, etc.
|
 | Note that learning is not just about discovering things
 | yourself. You also learn if you're capable of remembering (and
 | optionally applying) something that somebody else taught you.
 | Students, e.g., learn a formula for X, the periodic table, etc.
|
 | And this contraption not only remembers (that is, learns), but
 | also has an algorithm to figure out the solution in the first
 | place (that is, tries and discovers).
|
| So the title "machine learning" or even "AI" is perfectly
| legitimate. It's not GAI, but it's also not 2020 (not that we
| have GAI in 2020).
| CodeGlitch wrote:
| 2021 :)
| coldtea wrote:
| Heh, did 2020 really happen?
| Jugurtha wrote:
 | > _All learning amounts to remembering. A machine learning
 | model is basically remembered weights. Everything we've
 | learned is what we remember, solidified in our neurons,
 | etc._
|
| That's "not even wrong", as Pauli would say. Your paragraph
| suffers from using a shaky non inversible analogy:
|
| Machine learning often uses an analogy for the brain,
| neurons, activation functions, etc. Some accuracy about the
| real world is sacrificed in that analogy for the sake of
| being useful and to have _something_ to reason with and
| shared taxonomy. We accept that loss for the sake of being
| productive and for lack of actual equivalents.
|
 | Your first paragraph took that brain analogy used in machine
 | learning, shaky to begin with, and used it to reason about the
 | biological brain as if we did not have the actual thing.
|
| In other words, we had a biological brain that we clumsily
| modeled to get work done in ML, and the paragraph used that
| model to reason about the brain itself. Similar to how you
| translate from French --> English --> French and get a
| different output than the input.
|
 | Remembering certainly plays a role in learning, though it is
 | but one component. For it to _be_ what learning _is_, the two
 | would have to coincide in every instance.
|
 | To use the analogy: a machine learning model returns
 | predictions/output for instances it has not necessarily seen.
 | Our brain produces outputs for situations that have never
 | happened before, at least not in this 'anisotropic time'
 | universe.
|
| What do you think?
| coldtea wrote:
 | > _Your paragraph suffers from using a shaky, non-invertible
 | analogy_
|
 | That's neither here nor there, though. The brain doesn't
 | have to follow a specific model (e.g. the shaky one ML
 | uses, which I alluded to) for the analogy to work.
|
| It just has to have memory and remembrance as a core
| attribute of learning, which it does.
|
| Whether this happens through neurons and weights or some
| other mechanism, it's still remembering. Your counter-
| argument focused on implementation details.
|
| > _Remembering certainly plays a role in learning, though
| it is but one component. For it to be what learning is,
| everything has to be exactly the same with every instance._
|
| Well, remembering doesn't just "play a role", it plays the
| biggest role in learning. It's not even remotely optional:
| without remembering there's no learning.
|
| And inversely, discovering ourselves what we learn or
| applying it are both optional. I can listen to a professor
| tell me about something someone else discovered, and never
| apply it, but as long as I remember that information, I
| have learned it.
|
| And, of course, as I already wrote, the contraption doesn't
| just remember but also discovers the solution (and can even
| re-apply it).
| Jugurtha wrote:
 | > _Well, remembering doesn't just "play a role", it
 | plays the biggest role in learning. It's not even
 | remotely optional: without remembering there's no
 | learning._
|
| This is different from the phrase in your first paragraph
| that stated:
|
| > _All learning amounts to remembering._
|
| There is a difference between saying that remembering is
| a necessary condition for learning, and saying that
| remembering _is_ learning.
|
| Memory plays a role in learning. Does it play the biggest
| role? Let's assume it does. Is learning only memory? I
 | don't think so. Did you not really mean that all learning
 | _is_ remembering, but wrote it that way even though your
 | thoughts on it are more nuanced? Probably.
|
 | > _And inversely, discovering ourselves what we learn or
 | applying it are both optional. I can listen to a professor
 | tell me about something someone else discovered, and
 | never apply it, but as long as I remember that
 | information, I have learned it._
|
| See, here again I'll have to disagree. I'm looking at it
| from the standpoint of output and outcome of a system
| where that output is not simply information retrieval.
|
| Let's say the information is about driving. I can have a
| driving manual memorized. Can I remember that information
| about driving? Yes. Have I "learned" driving? No.
| coldtea wrote:
 | > _Memory plays a role in learning. Does it play the
 | biggest role? Let's assume it does. Is learning only
 | memory? I don't think so._
|
| The end result of learning (having learnt) is, I'd say,
| only memory.
|
 | Whether the memory is of static information or of dynamic
 | state (the difference between a file of data and a
 | program's runtime loaded with state), it's still just
 | memory.
|
| What else would it be?
|
| Sure, the process of learning, on the other hand, can
| take different forms. Those would be analogous to loading
| a memory store via different mechanisms.
|
 | > _Let's say the information is about driving. I can
 | have a driving manual memorized. Can I remember that
 | information about driving? Yes. Have I "learned" driving?
 | No._
|
| Let's say we can distinguish between two types of
| learnable things.
|
 | When it comes to information like the digits of pi, the
 | lyrics of a song, the periodic table, and many other
 | things, having them memorized is enough for us to say we
 | "learned them".
|
 | Your example with driving, however, is not about merely
 | learning some information, but about learning an
| activity. In your example we can surely say you've
| learned the contents of the driving manual. That's not
| the same as learning to drive, but nobody said it is. It
| is still learning the information within the manual,
| though.
|
 | Now, for learning to drive one would need to learn
 | some/all of the same information (perhaps in another
 | form: simpler, more colloquial, etc.), and also the
 | motions involved, how to respond to various road
 | situations, etc. This, however, is still part of "loading
 | the memory".
|
 | Isn't the end result the same, though? Information
 | (somehow; not really relevant how) stored in the brain of
 | the person who learned to drive (including things like
 | "muscle memory")? The process is not the same as
 | memorizing static information (e.g. a piece of text), but
 | the end result is still a brain that has a new state,
 | similar to a RAM or SSD that has a new state.
|
| See also my point above regarding static vs dynamic
| memory.
| recursivedoubts wrote:
| I mean, what did Claude Shannon know about computers or
| anything else?
| [deleted]
| jchallis wrote:
| By any reasonable definition of learning, the mouse adapts to a
| new situation.
| ARandomerDude wrote:
| 1952. After 70 years, millions of people, and billions of
| dollars, it makes sense that our algorithms are more
| sophisticated and our definitions more precise.
| optimuspaul wrote:
 | Sounds like learning to me. In the 1950s this was somewhat
| advanced machine learning.
| witherk wrote:
 | I kinda thought the point of the title was to show how what
 | counts as AI is just the newest form of programming.
| [deleted]
| DaniloDias wrote:
| When standing on the shoulders of giants, it's best to be
| compassionate in evaluating the implementations of the past.
| blackrock wrote:
 | How did he store the memory? Using some kind of small gearing
| inside the mouse?
| jchallis wrote:
| Electromechanical relays below the maze. Phone switches at the
| phone company labs.
| YeGoblynQueenne wrote:
 | The relays are behind the maze; what's under the maze is a
 | couple of arms with electromagnets that move to drag the
 | mouse around.
|
| In fact the "relays" are an electromechanical computer. I
| don't know much about the computers of that era, but since
| this is in Bell Labs in the early '50s, it may have been the
 | Model VI:
|
| https://en.wikipedia.org/wiki/Model_V#Model_VI
| mywittyname wrote:
| I couldn't find any design details online for how this was
| built. I'd love to read more technical details.
|
 | Since this is all electromechanical, it is probably extremely
 | simple in design. I suspect that there are two relays
 | (representing 2 bits) under each square's reed switch,
 | encoding the number of "left turns" to make upon entering
 | that square. The whiskers probably trigger a simple
 | incrementer for each square, to keep track of the number of
 | left turns to make. When the mouse is dropped in a space, it
 | probably turns left to reorient itself, but once it reaches a
 | point on a previously solved path, it knows which direction
 | to walk, because the number of left turns for that square is
 | already stored.
| [deleted]
| 2mol wrote:
| I recently watched a charming documentary on Shannon, called The
| Bit Player [0]. The film has some annoying flaws, but it really
| highlights how much Shannon was outside the box for a
 | mathematician. He seems to have been extremely playful,
 | tinkering with hardware projects throughout his life.
|
| How he even managed to make a (very limited) endgame chess
| computer [1] ~70 years ago still blows my mind.
|
| [0] https://thebitplayer.com/
|
| [1] https://www.chess.com/article/view/the-man-who-built-the-
| che...
| jointpdf wrote:
| Also, don't miss Turing's paper describing how to build a chess
| engine from scratch. The link to the .pdf is at the bottom of
| this contextual blurb:
|
| https://historyofinformation.com/detail.php?id=3905
| sarthakjain wrote:
 | It's interesting to consider why this was considered AI in 1952
 | while some may not consider it AI today. The AI was the search
 | algorithm that found an efficient solution to the maze, not the
 | mouse being able to navigate it later in a second run. The
 | second run was just a demonstration of its having found the
 | solution; the actual intelligence was its first run through the
 | maze. Almost any configuration of the maze could be solved
 | using algorithms like depth-first, breadth-first, or A* search
 | (I didn't check which one the video demonstrates). Even though
 | the algorithm was trivial, its applicability to problems of
 | today is still extraordinary. Neural networks are similarly
 | trivial algorithms capable of remarkable things. I'd argue this
 | is as much AI today as it was back then; just more people know
 | how Shannon performed this magic trick.
| sesqu wrote:
| Shannon did not use the word intelligence to describe the mouse
| in this demonstration - instead, he talked about learning.
| That's why the second run was considered more important than
| whatever algorithm was used to solve the maze.
|
| To that end, I'm curious about their cache invalidation
| solution. Are there timestamps, or is it a flag system?
| gnramires wrote:
| > I'm curious about their cache invalidation solution
|
 | My guess: there would be a model somewhere (probably a binary
 | relay map of walls) of the maze, and as soon as the mouse
 | hits an inconsistency, this map is updated. So there isn't
 | really a cache; it's more like a model, or perhaps you can
 | think of it as collision-based cache (model) invalidation.
 | The mouse then presumably follows the solution to this
 | modified maze, modified only insofar as it has measured
 | modifications.
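 |
 | A minimal Python sketch of that guess (illustrative only; the
 | set-of-walls representation and the handler names are mine,
 | not anything from Shannon's relay design):
 |
 |         # Model of the world: the set of transitions between
 |         # adjacent cells that are believed to be blocked.
 |         believed_walls = set()
 |
 |         def on_collision(src, dst):
 |             # Whiskers hit a wall the model thought was open:
 |             # update only this one belief.
 |             believed_walls.add(frozenset((src, dst)))
 |
 |         def on_free_passage(src, dst):
 |             # Moved through where a wall was believed to be:
 |             # erase the stale belief.
 |             believed_walls.discard(frozenset((src, dst)))
 |
 | Re-planning against believed_walls then disturbs only routes
 | that used the transitions that actually changed.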
| mywittyname wrote:
| You are being far, far, far too generous with the complexity
| of this design if you think there's some kind of cache
 | invalidation. It's an electromechanical relay computer, which
 | means it is going to be very simple in abstract design, because
| doing anything even mildly complex would require an insane
| amount of space.
|
| I can't find design documents for this, but I can make a
| pretty educated guess about its design.
|
| Each square has two relays, representing the number of left
| turns necessary to exit the square. Each time a whisker
| touches a wall, a signal is sent to a mechanical adder which
| will add 1 to the relays in the space. When the mouse enters
| a square, a "register" is set with a value, based on if it
| entered from the left, top, right, or bottom, then the mouse
 | is turned and the register decremented until it hits 0; then
 | the mouse attempts to walk in the indicated direction.
|
 | The maze ends up looking something like this:
 |
 |             +-----+
 |             |0|1 1|
 |             +-- - +
 |             |1 3|0|
 |             + --- +
 |             |1 3|x|
 |             +-- --+
|
 | The mouse starts on x and, in each square, turns the stored
 | number of times. You can actually put the mouse down anywhere
 | and it will exit the maze, as long as the walls are left
 | unchanged.
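 |
 | For concreteness, here is the replay half of that guess as a
 | Python sketch (hypothetical throughout: the table transcribes
 | the diagram above, and whether it traces the exit depends on
 | walls the ASCII art only hints at):
 |
 |         # Stored left-turn counts per square (col, row),
 |         # read off the diagram; (2, 2) is the start, x.
 |         LEFT_TURNS = {
 |             (0, 0): 0, (1, 0): 1, (2, 0): 1,
 |             (0, 1): 1, (1, 1): 3, (2, 1): 0,
 |             (0, 2): 1, (1, 2): 3,
 |         }
 |
 |         # Headings ordered so +1 is one left (CCW) turn.
 |         HEADINGS = [(0, -1), (-1, 0), (0, 1), (1, 0)]  # N W S E
 |
 |         def step(pos, heading):
 |             # Turn left as many times as this square's relay
 |             # counter says, then walk one cell that way.
 |             heading = (heading + LEFT_TURNS[pos]) % 4
 |             dc, dr = HEADINGS[heading]
 |             return (pos[0] + dc, pos[1] + dr), heading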
| YeGoblynQueenne wrote:
 | >> The AI was the search algorithm that found an efficient
 | solution to the maze, not the mouse being able to navigate it
| later in a second run.
|
 | But that's not the whole story! The program can update its
 | solution when the maze changes, and it is capable of changing
 | only the part of the solution that has actually changed. When
 | Shannon changed the maze and placed Theseus in the modified
 | part, I kind of rolled my eyes, sure that it was going to
 | start a new search all over again, but I was wrong: it
 | searches until it finds where the unmodified part of the maze
 | begins, then continues on the path it learned before.
|
 | It seems that, in solving the maze, the program builds some
 | kind of model of its world that it can then manipulate with
 | economy. For comparison, neural nets cannot update their
 | models: when the world changes, a neural net can only train
 | its model all over again, from scratch, just like I thought
 | Theseus would start a whole new search when Shannon changed
 | the maze. And neural nets certainly cannot update _parts_ of
 | their models!
|
| This demonstration looks primitive because everything is so old
| (a computer made with telephone relays!), but it's actually
| attacking problems that continue to tie AI systems of today
| into knots. It is certainly AI. And, in "early 1950's", it's AI
| _avant la lettre_.
| salty_biscuits wrote:
 | I always say that AI is a forever-moving goalpost. It is
| simply a task a human can do that you wouldn't expect a machine
| to be able to do. So as soon as a machine can do it, people no
| longer consider it intelligent (i.e. it is just A*, it is just
| a chess engine, it is just a network picking up on patches of
| texture, ..., it isn't really "intelligent").
| Jasper_ wrote:
| This is because we originally thought "only a human would be
| able to play chess", "only a human would be able to drive a
| car". The thinking there is that if we were to solve these
| problems, we'd _have_ to get closer to a true artificial
 | intelligence (the kind that today we'd call "AGI" because
| "AI" doesn't mean anything anymore).
|
| This line of thinking has been shown to be pretty faulty.
| We've come up with engines and algorithms that can play Go
| and Chess, but we aren't any closer to anything that
| resembles a general intelligence.
| mkl95 wrote:
 | A* search as we know it wasn't developed until the mid-60s.
| peter303 wrote:
 | The term A.I. was coined four years later, in 1956. But an
 | earlier term, cybernetics, encompassed some aspects of A.I.
| ExcavateGrandMa wrote:
 | 1952, ay caramba!
 |
 | I am wondering: is this an undepletable resource?
|
| KrkrkkRrkrkrkrkrkrk!
| jzer0cool wrote:
 | Here is our current modern version of mouse maze competition:
 | https://youtu.be/NqdZ9wbXt8k?t=86 ... it's so smooth. Would
 | love to learn more about how to build one of these if anyone
 | has suggestions, starting with basic servos for a light
 | introduction.
| tachyonbeam wrote:
| I don't think you'd want to use servos to drive the wheels for
| something like this. You probably want very fast steppers (so
| you can get precise rotations) or geared DC motors (so you can
| move very fast). I'm assuming the robot has some very basic
| black and white computer vision, or maybe even simple laser
| range finders, which would be simpler to work with and more
| likely to generalize to new mazes.
|
| Here is an example of an inexpensive laser range sensor:
| https://www.ebay.com/itm/TOF10120-ToF-Laser-Range-Sensor-Las...
| jazzyjackson wrote:
 | What a wonderful era of creativity in machine design. Building a
 | machine to reproduce the behavior of living systems goes a long
 | way back, and this mouse acting like a mouse makes me think of a
 | great documentary on automata that spends some time on the
 | people and politics around building fantastically expensive
 | robotic birds, swans, and whole towns driven by a clock. [1]
|
 | Building a machine that mimics neural processing is just a
 | continuation of that tradition. One other machine that amazed me
 | is Paul Weston's Numa-rete, an early attempt at a neural network
 | composed of 400 "synapses", which could count the number of
 | objects on its surface without any CPU to coordinate it: just
 | direct, parallel, analog computation. A good explanation is here
 | [2]; this was in 1961.
|
| `What will be done is to show what is inside, revealing how
| little intelligence it really had. However, many uninitiated
| persons who tried and failed at the time to "trick" it into error
| by, e.g., inserting objects within holes in larger objects were
| ready to believe that it was intelligent`
|
| [1] https://youtu.be/NcPA0jvp9IQ
|
| [2]
| https://web.archive.org/web/20160401023457/http://bcl.ece.il...
| YeGoblynQueenne wrote:
| From the title I was sure that this would be some horrible misuse
| of "machine learning" to stand-in for "AI" but while Shannon
| himself never uses the term in the video, the title seems
| accurate enough.
|
| To summarise: a model mouse, named Theseus, is moved around a
| maze by a system of electromagnets placed under the maze and
| guided by an electromechanical computer (made with telephone
 | relays). The computer is from Bell Labs and, given the date
 | listed for the video ("early 1950's"), it may be a Model VI.
|
| The program that drives the mouse is a combination of a search
| algorithm with some kind of modular memory. The mouse first finds
| its way around the maze by successivly trying a set of four moves
| at right angles. The sequence seems to be: move forward; if
| that's impossible, turn 90deg CW; repeat. This eventually finds a
| path to the end of the maze, where a goal is placed (possibly
| magnetic). In more modern terms, the program learns a plan to
| navigate the maze. Once the plan is learned it can be re-used any
| number of times to solve the maze without any more searching.
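 |
 | A loop-avoiding Python sketch of that strategy (my
 | reconstruction from the footage, not Shannon's circuit;
 | blocked() stands in for the whisker switches):
 |
 |         DIRS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # N E S W
 |
 |         def search(start, goal, blocked):
 |             # Each square remembers the direction last used to
 |             # leave it; hitting a wall rotates 90deg CW.
 |             # Advancing the pointer on every visit avoids
 |             # livelock between two mutually open squares.
 |             memory, pos = {}, start
 |             while pos != goal:
 |                 d = (memory.get(pos, 3) + 1) % 4
 |                 while blocked(pos, DIRS[d]):
 |                     d = (d + 1) % 4
 |                 memory[pos] = d
 |                 pos = (pos[0] + DIRS[d][0],
 |                        pos[1] + DIRS[d][1])
 |             return memory          # the learned plan
 |
 |         def replay(start, goal, memory):
 |             # Second run: no searching, just follow the plan.
 |             pos = start
 |             while pos != goal:
 |                 d = memory[pos]
 |                 pos = (pos[0] + DIRS[d][0],
 |                        pos[1] + DIRS[d][1])
 |
 | Following each square's last-used exit leads to the goal
 | without retracing loops, which is why the second run looks so
 | direct.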
|
| The interesting thing is that, if the maze changes, the program
| only needs to replace _part_ of its learned plan. Shannon
| demonstrates this by moving around some of the walls of the maze,
| which are modular segments. Theseus initially enters its
| searching routine, until it finds its way to the part of the maze
 | that it knows already. After that, it's smooth sailing,
| following the plan it already knows.
|
 | I have no idea what technique does that. If I had to guess,
 | I'd guess that the program has a model of its world (a map of
 | the maze) or, more interestingly, builds one as it goes along.
 | When the maze changes, the program updates its model, but it is
 | able to update only as much of the model as corresponds to the
 | part of the world that has really changed. This suggests some
 | kind of object-permanence assumption, an encoding of the
 | knowledge that if the state of some entity in the world is not
 | known to have changed, then it has not changed. Or, in other
 | words, some solution to the frame problem (which was actually
 | described about a decade later than Shannon's demonstration).
|
 | Note well that modern AI systems very rarely exhibit this
 | ability. For instance, deep neural networks _cannot update
 | their models_: not partly, not wholesale. Like I say, I have no
 | idea what the demonstrated approach is. It may be something
 | well-known but rarely used these days. It may be something that
 | Shannon came up with before John McCarthy fathered AI, so that
 | it never really caught on as an AI approach. If anyone knows,
 | please do tell; I'm very curious.
|
| In any case "Theseus" (rather, its AI) seems to exhibit what
| McCarthy called "elaboration tolerance", making formal
| representations of facts that can be modified easily:
|
| http://jmc.stanford.edu/articles/elaboration.html
___________________________________________________________________
(page generated 2021-01-26 23:00 UTC)