[HN Gopher] Claude Shannon Demonstrates Machine Learning (1952)
       ___________________________________________________________________
        
       Claude Shannon Demonstrates Machine Learning (1952)
        
       Author : jchallis
       Score  : 108 points
       Date   : 2021-01-26 17:56 UTC (5 hours ago)
        
 (HTM) web link (techchannel.att.com)
 (TXT) w3m dump (techchannel.att.com)
        
       | dang wrote:
       | If curious see also
       | 
       | 2014 https://news.ycombinator.com/item?id=7758547
       | 
       | I think there have been other threads too?
        
       | donquichotte wrote:
       | What a grave misunderstanding! What he is demonstrating is
       | remembering, or memory.
       | 
       | It appears that the state of the mouse is [X, Y, Heading], where
       | X and Y are discrete positions in the available cells.
       | 
       | Once the maze has been solved by "a rather involved strategy of
       | trial and error", the solution is saved, and for any of the known
       | states, the mouse "remembers" its way to the exit of the maze.
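        | 
        | A toy sketch of that in Python (my reconstruction of the idea
        | only; the grid, the names, and the clockwise-turn rule are my
        | assumptions, not Shannon's relay design):
        | 
        |   # First run: trial and error fills a table; second run: replay.
        |   GRID = ["#####",
        |           "#S..#",
        |           "#.#.#",
        |           "#..G#",
        |           "#####"]
        |   DIRS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # N, E, S, W
        | 
        |   def is_open(x, y):
        |       return GRID[y][x] != "#"
        | 
        |   def first_run(start, goal):
        |       """Trial and error: go forward if open, else rotate CW."""
        |       memory = {}                  # (x, y) -> heading that worked
        |       (x, y), h = start, 0
        |       while (x, y) != goal:
        |           dx, dy = DIRS[h]
        |           while not is_open(x + dx, y + dy):
        |               h = (h + 1) % 4      # turn 90 degrees clockwise
        |               dx, dy = DIRS[h]
        |           memory[(x, y)] = h       # remember the exit heading
        |           x, y = x + dx, y + dy
        |       return memory
        | 
        |   def second_run(start, goal, memory):
        |       """Pure remembering: replay stored moves, no search."""
        |       (x, y), path = start, [start]
        |       while (x, y) != goal:
        |           dx, dy = DIRS[memory[(x, y)]]
        |           x, y = x + dx, y + dy
        |           path.append((x, y))
        |       return path
        | 
        |   print(second_run((1, 1), (3, 3), first_run((1, 1), (3, 3))))
        | 
        | (The naive rotate-clockwise search happens to terminate on this
        | toy grid; the real machine's strategy was more involved.)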
        
         | coldtea wrote:
         | All learning amounts to remembering. A machine learning model
         | is basically remembered weights. Everything we've learned is
         | what we remember as solidified in our neurons, etc.
         | 
          | Note that learning is not about discovering things
          | yourself. You also learn if you're capable of remembering
          | (and optionally applying) something that somebody else
          | taught you. Students e.g. learn a formula for X, the
          | periodic table, etc.
         | 
          | And this contraption not only remembers (that is, learns), but
          | also has an algorithm to figure out the solution in the first
          | place (that is, tries and discovers).
         | 
         | So the title "machine learning" or even "AI" is perfectly
         | legitimate. It's not GAI, but it's also not 2020 (not that we
         | have GAI in 2020).
        
           | CodeGlitch wrote:
           | 2021 :)
        
             | coldtea wrote:
             | Heh, did 2020 really happen?
        
           | Jugurtha wrote:
            | > _All learning amounts to remembering. A machine learning
            | model is basically remembered weights. Everything we've
            | learned is what we remember as solidified in our neurons,
            | etc._
           | 
           | That's "not even wrong", as Pauli would say. Your paragraph
           | suffers from using a shaky non inversible analogy:
           | 
            | Machine learning often uses an analogy to the brain:
            | neurons, activation functions, etc. Some accuracy about the
            | real world is sacrificed in that analogy for the sake of
            | usefulness, to have _something_ to reason with and a shared
            | taxonomy. We accept that loss for the sake of being
            | productive and for lack of actual equivalents.
           | 
            | What your first paragraph does is take that analogy of the
            | brain used in machine learning, shaky to begin with, and
            | use it to reason about the biological brain as if we did
            | not have the actual thing.
           | 
           | In other words, we had a biological brain that we clumsily
           | modeled to get work done in ML, and the paragraph used that
           | model to reason about the brain itself. Similar to how you
           | translate from French --> English --> French and get a
           | different output than the input.
           | 
            | Remembering certainly plays a role in learning, though it is
            | but one component. For it to _be_ what learning _is_,
            | everything has to be exactly the same with every instance.
           | 
           | To use the analogy, a machine learning model returns
           | predictions/output for instances it has not necessarily seen.
           | Our brain produces outputs based on situations that had not
           | yet happened, at least not in the 'anisotropic time'
           | universe, before that.
           | 
           | What do you think?
        
             | coldtea wrote:
              | > _Your paragraph suffers from using a shaky, non-
              | invertible analogy_
             | 
              | That's neither here nor there, though. The brain doesn't
              | have to follow a specific model (e.g. the shaky model ML
              | uses, which I alluded to) for the analogy to work.
             | 
             | It just has to have memory and remembrance as a core
             | attribute of learning, which it does.
             | 
             | Whether this happens through neurons and weights or some
             | other mechanism, it's still remembering. Your counter-
             | argument focused on implementation details.
             | 
             | > _Remembering certainly plays a role in learning, though
             | it is but one component. For it to be what learning is,
             | everything has to be exactly the same with every instance._
             | 
             | Well, remembering doesn't just "play a role", it plays the
             | biggest role in learning. It's not even remotely optional:
             | without remembering there's no learning.
             | 
              | And conversely, discovering what we learn ourselves, or
              | applying it, are both optional. I can listen to a
              | professor tell me about something someone else
              | discovered, and never apply it, but as long as I remember
              | that information, I have learned it.
             | 
             | And, of course, as I already wrote, the contraption doesn't
             | just remember but also discovers the solution (and can even
             | re-apply it).
        
               | Jugurtha wrote:
                | > _Well, remembering doesn't just "play a role", it
                | plays the biggest role in learning. It's not even
                | remotely optional: without remembering there's no
                | learning._
               | 
               | This is different from the phrase in your first paragraph
               | that stated:
               | 
               | > _All learning amounts to remembering._
               | 
               | There is a difference between saying that remembering is
               | a necessary condition for learning, and saying that
               | remembering _is_ learning.
               | 
                | Memory plays a role in learning. Does it play the
                | biggest role? Let's assume it does. Is learning only
                | memory? I don't think so. Did you write that all
                | learning is remembering even though your thoughts on it
                | are more nuanced? Probably.
               | 
                | > _And conversely, discovering what we learn ourselves,
                | or applying it, are both optional. I can listen to a
                | professor tell me about something someone else
                | discovered, and never apply it, but as long as I
                | remember that information, I have learned it._
               | 
                | See, here again I'll have to disagree. I'm looking at it
                | from the standpoint of the output and outcome of a
                | system, where that output is not simply information
                | retrieval.
               | 
               | Let's say the information is about driving. I can have a
               | driving manual memorized. Can I remember that information
               | about driving? Yes. Have I "learned" driving? No.
        
               | coldtea wrote:
                | > _Memory plays a role in learning. Does it play the
                | biggest role? Let's assume it does. Is learning only
                | memory? I don't think so._
               | 
               | The end result of learning (having learnt) is, I'd say,
               | only memory.
               | 
                | Whether the memory is of static information or of
                | dynamic state (the difference between a file of data
                | and a program's runtime loaded with state), it's still
                | just memory.
               | 
               | What else would it be?
               | 
               | Sure, the process of learning, on the other hand, can
               | take different forms. Those would be analogous to loading
               | a memory store via different mechanisms.
               | 
                | > _Let's say the information is about driving. I can
                | have a driving manual memorized. Can I remember that
                | information about driving? Yes. Have I "learned"
                | driving? No._
               | 
               | Let's say we can distinguish between two types of
               | learnable things.
               | 
                | When it comes to information like the digits of pi, the
                | lyrics of a song, or the periodic table, and many other
                | things, having such things memorized is enough for us
                | to say we "learned" them.
               | 
                | Your example with driving, however, is not about merely
                | learning some information, but about learning an
                | activity. In your example we can surely say you've
                | learned the contents of the driving manual. That's not
                | the same as learning to drive, but nobody said it is. It
                | is still learning the information within the manual,
                | though.
               | 
                | Now, to learn to drive one would need to learn some or
                | all of the same information (perhaps in another form:
                | simpler, more colloquial, etc.), and also the motions
                | involved, how to respond to various road situations,
                | etc. This, however, is still the "loading the memory"
                | part.
               | 
                | Isn't the end result the same, though? Information
                | (somehow; not really relevant how) stored in the brain
                | of the person who learned to drive (including things
                | like "muscle memory")? The process is not the same as
                | memorizing static information (e.g. a piece of text),
                | but the end result is still a brain that has a new
                | state, similar to RAM or an SSD that has a new state.
               | 
               | See also my point above regarding static vs dynamic
               | memory.
        
         | recursivedoubts wrote:
         | I mean, what did Claude Shannon know about computers or
         | anything else?
        
         | [deleted]
        
         | jchallis wrote:
         | By any reasonable definition of learning, the mouse adapts to a
         | new situation.
        
         | ARandomerDude wrote:
         | 1952. After 70 years, millions of people, and billions of
         | dollars, it makes sense that our algorithms are more
         | sophisticated and our definitions more precise.
        
         | optimuspaul wrote:
          | Sounds like learning to me. In the 1950s this was somewhat
          | advanced machine learning.
        
         | witherk wrote:
          | I kinda thought the point of the title was to show how what
          | counts as AI is just the newest form of programming.
        
         | [deleted]
        
         | DaniloDias wrote:
         | When standing on the shoulders of giants, it's best to be
         | compassionate in evaluating the implementations of the past.
        
       | blackrock wrote:
        | How did he store the memory? Using some kind of small gears
        | inside the mouse?
        
         | jchallis wrote:
         | Electromechanical relays below the maze. Phone switches at the
         | phone company labs.
        
           | YeGoblynQueenne wrote:
            | The relays are behind the maze - what's under the maze is a
            | couple of arms with electromagnets that move to drag the
            | mouse around the maze.
           | 
           | In fact the "relays" are an electromechanical computer. I
           | don't know much about the computers of that era, but since
           | this is in Bell Labs in the early '50s, it may have been the
           | model VI:
           | 
           | https://en.wikipedia.org/wiki/Model_V#Model_VI
        
           | mywittyname wrote:
           | I couldn't find any design details online for how this was
           | built. I'd love to read more technical details.
           | 
            | Since this is purely electromechanical, it is probably
            | extremely simple in design. I suspect that there are two
            | relays (representing 2 bits) under each square's reed
            | switch, encoding the number of "left turns" to make upon
            | entering that square. The whiskers probably trigger a
            | simple incrementer for each square to keep track of the
            | number of left turns to make. When the mouse is dropped in
            | a space, it probably turns left to reorient itself, but
            | then knows the direction to walk once it reaches a point on
            | a path it has solved before, because the number of left
            | turns to make in that space is already stored.
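            | 
            | In Python terms, something like this (pure speculation on
            | my part; the names and the mod-4 counter are my
            | assumptions, as there are no design documents to check
            | against):
            | 
            |   # Two relays per square = a 2-bit counter of "left turns
            |   # to make upon entering", bumped on each wall hit.
            |   turns = {}                              # (x, y) -> 0..3
            | 
            |   def whisker_hit(square):
            |       turns[square] = (turns.get(square, 0) + 1) % 4
            | 
            |   def exit_heading(square, entry_heading):
            |       # rotate left from the entry heading by the count
            |       return (entry_heading - turns.get(square, 0)) % 4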
        
       | [deleted]
        
       | 2mol wrote:
       | I recently watched a charming documentary on Shannon, called The
       | Bit Player [0]. The film has some annoying flaws, but it really
        | highlights what an outside-the-box thinker Shannon was for a
        | mathematician. He seems to have been extremely playful,
        | tinkering with hardware projects throughout his life.
       | 
       | How he even managed to make a (very limited) endgame chess
       | computer [1] ~70 years ago still blows my mind.
       | 
       | [0] https://thebitplayer.com/
       | 
       | [1] https://www.chess.com/article/view/the-man-who-built-the-
       | che...
        
         | jointpdf wrote:
         | Also, don't miss Turing's paper describing how to build a chess
         | engine from scratch. The link to the .pdf is at the bottom of
         | this contextual blurb:
         | 
         | https://historyofinformation.com/detail.php?id=3905
        
       | sarthakjain wrote:
        | It's interesting to note why this was considered AI in 1952 and
        | why some may not consider it to be AI today. The AI was the
        | search algorithm that found an efficient solution to the maze,
        | not the mouse being able to navigate it later in a second run.
        | The second run was just a demonstration that it had found the
        | solution; the actual intelligence was its first run through the
        | maze. Almost any configuration of the maze could be solved
        | using algorithms like depth-first, breadth-first, or A* search
        | (I didn't check which one the video demonstrates). Even though
        | the algorithm was trivial, its applicability to the problems of
        | today is still extraordinary. Neural networks are equally
        | trivial algorithms capable of remarkable things. I'd argue this
        | is as much AI today as it was back then; just more people know
        | how Shannon performed this magic trick.
        
         | sesqu wrote:
         | Shannon did not use the word intelligence to describe the mouse
         | in this demonstration - instead, he talked about learning.
         | That's why the second run was considered more important than
         | whatever algorithm was used to solve the maze.
         | 
         | To that end, I'm curious about their cache invalidation
         | solution. Are there timestamps, or is it a flag system?
        
           | gnramires wrote:
           | > I'm curious about their cache invalidation solution
           | 
            | My guess: there would be a model of the maze somewhere
            | (probably a binary relay map of walls), and as soon as the
            | mouse hits an inconsistency, this map is updated. So there
            | isn't really a cache; it's more like a model, or perhaps
            | you can think of it as collision-based cache (model)
            | invalidation. The mouse probably then follows the solution
            | to this modified maze, modified only insofar as it has
            | measured modifications.
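            | 
            | A minimal sketch of that guess in Python (all names
            | hypothetical, of course):
            | 
            |   # A map of believed walls, patched one edge at a time.
            |   believed_walls = set()    # edges as frozenset({a, b})
            | 
            |   def observe(frm, to, collided):
            |       edge = frozenset((frm, to))
            |       if collided:
            |           believed_walls.add(edge)      # new wall learned
            |       else:
            |           believed_walls.discard(edge)  # believed wall gone
            |       # only this edge changes; the rest of the model
            |       # stays untouched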
        
           | mywittyname wrote:
            | You are being far, far, far too generous with the
            | complexity of this design if you think there's some kind of
            | cache invalidation. It's a purely electromechanical
            | computer, which means it is going to be very simple in
            | abstract design, because doing anything even mildly complex
            | would require an insane amount of space.
           | 
           | I can't find design documents for this, but I can make a
           | pretty educated guess about its design.
           | 
            | Each square has two relays, representing the number of left
            | turns necessary to exit the square. Each time a whisker
            | touches a wall, a signal is sent to a mechanical adder
            | which adds 1 to the relays for that square. When the mouse
            | enters a square, a "register" is set with a value based on
            | whether it entered from the left, top, right, or bottom;
            | then the mouse is turned and the register decremented until
            | it hits 0, and the mouse attempts to walk in the indicated
            | direction.
           | 
            | The maze ends up looking something like this:
            | 
            |   +-----+
            |   |0|1 1|
            |   +-- - +
            |   |1 3|0|
            |   + --- +
            |   |1 3|x|
            |   +-- --+
            | 
            | Where the mouse starts on x and turns the indicated number
            | of times in each square. You can actually put the mouse
            | down anywhere and it will exit the maze, if the walls are
            | left unchanged.
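            | 
            | The per-square logic would then be something like this
            | (again just a guess; the turn direction and the names are
            | my assumptions):
            | 
            |   def walk_out_heading(square_count, entry_heading):
            |       # "register" loaded from the square's two relays
            |       register, heading = square_count, entry_heading
            |       while register > 0:
            |           heading = (heading + 1) % 4  # one quarter turn
            |           register -= 1
            |       return heading                   # direction to walk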
        
         | YeGoblynQueenne wrote:
          | >> The AI was the search algorithm that found an efficient
          | solution to the maze, not the mouse being able to navigate it
          | later in a second run.
         | 
          | But that's not the whole story! The program can update its
          | solution of the maze when the maze changes, and it is capable
          | of changing only the part of the solution that has actually
          | changed. When Shannon changed the maze and placed Theseus in
          | the modified part, I kind of rolled my eyes, sure that it was
          | going to start a new search all over again, but I was wrong:
          | it searches until it finds where the unmodified part of the
          | maze begins, then continues on the path it learned before.
         | 
         | It seems that, in solving the maze, the program is building
         | some kind of model of its world, that it can then manipulate
         | with economy. For comparison, neural nets cannot update their
         | models - when the world changes, a neural net can only train
         | its model all over again, from scratch, just like I thought
         | Theseus would start a whole new search when Shannon changed the
         | maze. And neural nets can certainly not update _parts_ of their
         | models!
         | 
         | This demonstration looks primitive because everything is so old
         | (a computer made with telephone relays!), but it's actually
         | attacking problems that continue to tie AI systems of today
         | into knots. It is certainly AI. And, in "early 1950's", it's AI
         | _avant la lettre_.
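          | 
          | If I had to sketch the behaviour I think I'm seeing, in
          | Python (purely a guess, and ignoring the case where an old
          | plan entry now points into a new wall):
          | 
          |   def run_after_change(pos, plan, explore, goal):
          |       """plan: dict pos -> next_pos from earlier runs."""
          |       while pos != goal:
          |           if pos in plan:          # unmodified part: replay
          |               pos = plan[pos]
          |           else:                    # modified part: search
          |               nxt = explore(pos)   # one trial-and-error step
          |               plan[pos] = nxt      # patch only this entry
          |               pos = nxt
          |       return plan                  # old plan, locally repaired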
        
         | salty_biscuits wrote:
         | I always say that AI is a forever moving goal post. It is
         | simply a task a human can do that you wouldn't expect a machine
         | to be able to do. So as soon as a machine can do it, people no
         | longer consider it intelligent (i.e. it is just A*, it is just
         | a chess engine, it is just a network picking up on patches of
         | texture, ..., it isn't really "intelligent").
        
           | Jasper_ wrote:
            | This is because we originally thought "only a human would
            | be able to play chess", "only a human would be able to
            | drive a car". The thinking there is that if we were to
            | solve these problems, we'd _have_ to get closer to a true
            | artificial intelligence (the kind that today we'd call
            | "AGI" because "AI" doesn't mean anything anymore).
           | 
           | This line of thinking has been shown to be pretty faulty.
           | We've come up with engines and algorithms that can play Go
           | and Chess, but we aren't any closer to anything that
           | resembles a general intelligence.
        
         | mkl95 wrote:
          | A* search as we know it wasn't developed until the mid-60s.
        
         | peter303 wrote:
          | The term A.I. was coined four years later, in 1956. But an
          | earlier term, cybernetics, encompassed some aspects of A.I.
        
       | ExcavateGrandMa wrote:
        | 1952, ay caramba!
        | 
        | Am wondering, is this an undepletable resource?
       | 
       | KrkrkkRrkrkrkrkrkrk!
        
       | jzer0cool wrote:
        | Here is our current, modern version of the mouse maze
        | competition: https://youtu.be/NqdZ9wbXt8k?t=86 ... it's so
        | smooth. Would love to learn more about how to build one of
        | these, if anyone has suggestions; starting with basic servos
        | for a light introduction.
        
         | tachyonbeam wrote:
         | I don't think you'd want to use servos to drive the wheels for
         | something like this. You probably want very fast steppers (so
         | you can get precise rotations) or geared DC motors (so you can
         | move very fast). I'm assuming the robot has some very basic
         | black and white computer vision, or maybe even simple laser
         | range finders, which would be simpler to work with and more
         | likely to generalize to new mazes.
         | 
         | Here is an example of an inexpensive laser range sensor:
         | https://www.ebay.com/itm/TOF10120-ToF-Laser-Range-Sensor-Las...
        
       | jazzyjackson wrote:
       | What a wonderful era of creativity in machine design. Building a
       | machine to reproduce the behavior of living systems goes a long
       | way back, and this mouse acting like a mouse makes me think of a
        | great documentary on automata, which spends some time talking
        | about the people and politics around building these
        | fantastically expensive robotic birds and swans and whole towns
        | driven by a clock. [1]
       | 
       | Building a machine that mimics neural processing is just a
       | continuation of that tradition. One other machine that amazed me
        | is Paul Weston's numa-rete, an early attempt at neural
        | networks, composed of 400 "synapses" which can count the number
        | of objects on its surface without any CPU to coordinate them,
        | just direct parallel analog computation. A good explanation is
        | here [2]; this was in 1961.
       | 
       | `What will be done is to show what is inside, revealing how
       | little intelligence it really had. However, many uninitiated
       | persons who tried and failed at the time to "trick" it into error
       | by, e.g., inserting objects within holes in larger objects were
       | ready to believe that it was intelligent`
       | 
       | [1] https://youtu.be/NcPA0jvp9IQ
       | 
       | [2]
       | https://web.archive.org/web/20160401023457/http://bcl.ece.il...
        
       | YeGoblynQueenne wrote:
        | From the title I was sure that this would be some horrible
        | misuse of "machine learning" to stand in for "AI", but while
        | Shannon himself never uses the term in the video, the title
        | seems accurate enough.
       | 
       | To summarise: a model mouse, named Theseus, is moved around a
       | maze by a system of electromagnets placed under the maze and
       | guided by an electromechanical computer (made with telephone
       | relays). The computer is from Bell Labs and given the date listed
       | for the video ("early 1950's") it may be a Model VI.
       | 
        | The program that drives the mouse is a combination of a search
        | algorithm with some kind of modular memory. The mouse first finds
        | its way around the maze by successively trying a set of four
        | moves at right angles. The sequence seems to be: move forward; if
        | that's impossible, turn 90deg CW; repeat. This eventually finds a
        | path to the end of the maze, where a goal is placed (possibly
        | magnetic). In more modern terms, the program learns a plan to
        | navigate the maze. Once the plan is learned it can be re-used any
        | number of times to solve the maze without any more searching.
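        | 
        | In Python-ish pseudocode, the loop I think I'm seeing (my
        | reconstruction; `blocked` and the clockwise turn are my
        | assumptions, not the actual relay logic):
        | 
        |   plan = {}                      # (x, y) -> heading that worked
        | 
        |   def next_heading(x, y, heading, blocked):
        |       if (x, y) in plan:         # already learned: follow plan
        |           return plan[(x, y)]
        |       while blocked(x, y, heading):
        |           heading = (heading + 1) % 4   # turn 90deg CW
        |       plan[(x, y)] = heading     # plan is now re-usable
        |       return heading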
       | 
        | The interesting thing is that, if the maze changes, the program
        | only needs to replace _part_ of its learned plan. Shannon
        | demonstrates this by moving around some of the walls of the maze,
        | which are modular segments. Theseus initially enters its
        | searching routine, until it finds its way to the part of the maze
        | that it knows already. After that, it's smooth sailing,
        | following the plan it already knows.
       | 
        | I have no idea what technique does that. If I had to guess, I'd
        | guess that the program has a model of its world - a map of the
        | maze - or, more interestingly, builds one as it goes along.
        | When the maze changes, the program updates its model, but it is
        | able to update only as much of the model as corresponds to the
        | part of the world that has really changed. This suggests some
        | kind of object permanence assumption, an encoding of the
        | knowledge that if the state of some entity in the world is not
        | known to have changed, then it has not changed. Or, in other
        | words, some solution to the frame problem (which was actually
        | described about a decade later than Shannon's demonstration).
       | 
        | Note well that modern AI systems very rarely exhibit this
        | ability. For instance, deep neural networks _cannot update their
        | models_ - not partly, not wholesale. Like I say, I have no idea
        | what the approach demonstrated here is. It may be something
        | well-known but rarely used these days. It may be something that
        | Shannon came up with before John McCarthy fathered AI, so it
        | never really caught on as an AI approach. If anyone knows, please
        | do tell; I'm very curious.
       | 
       | In any case "Theseus" (rather, its AI) seems to exhibit what
       | McCarthy called "elaboration tolerance", making formal
       | representations of facts that can be modified easily:
       | 
       | http://jmc.stanford.edu/articles/elaboration.html
        
       ___________________________________________________________________
       (page generated 2021-01-26 23:00 UTC)