[HN Gopher] Thinking Machines - Introduction to Data Parallel Su...
       ___________________________________________________________________
        
       Thinking Machines - Introduction to Data Parallel Supercomputing
       (1989)
        
       Author : gone35
       Score  : 49 points
       Date   : 2022-05-15 10:01 UTC (2 days ago)
        
 (HTM) web link (www.youtube.com)
 (TXT) w3m dump (www.youtube.com)
        
       | dragontamer wrote:
       | I haven't seen this video yet, but I should note that "Thinking
       | Machine's CM2" processor was a 4096x 1-bit SIMD processor.
       | 
       | Which was enough to really have CUDA-like or OpenCL-like code
       | back in the day. Operations could be compiled into 1-bit commands
       | to be executed in parallel across 4096 "SIMD-lanes / threads"
       | (much akin to CUDA-threads).
       | 
       | A lot of the research from the CM2 / Thinking Machines was
       | translated into modern GPU code (parallel prefix sum, radix sort,
       | etc. etc.). The research done back then really laid the
       | foundation for today's embarrassingly parallel work.
       | 
       | There were some other computers that came before CM2 of course.
       | But CM2 + Thinking Machines is very clearly part of the great
       | history of SIMD-compute / PRAM model of compute / Parallel
       | vectors / etc. etc.
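       | 
       | For reference, a minimal sketch of the parallel prefix sum
       | mentioned above, written as a Hillis-Steele inclusive scan in
       | plain Python. This is an illustration, not CM2 or CUDA code: the
       | name inclusive_scan is made up here, and each list comprehension
       | stands in for one lockstep step across all SIMD lanes.

```python
def inclusive_scan(values):
    """Hillis-Steele inclusive prefix sum over a list of numbers."""
    data = list(values)
    n = len(data)
    stride = 1
    while stride < n:
        # The comprehension reads only the previous step's list, so every
        # "lane" i could execute this same instruction at the same time.
        data = [
            data[i] + data[i - stride] if i >= stride else data[i]
            for i in range(n)
        ]
        stride *= 2
    return data


print(inclusive_scan([3, 1, 4, 1, 5, 9, 2, 6]))
# → [3, 4, 8, 9, 14, 23, 25, 31]
```

       | With n lanes the loop takes about log2(n) steps, which is why
       | scans map well onto wide SIMD hardware.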
        
         | chris_st wrote:
         | Minor clarification -- the CM1 was the 4096-processor machine.
         | 
         | The CM2 had (well, those delivered to customers, anyway) a
         | minimum of 8,192 processors, and up to a maximum of 65,536. I
         | worked on one for a number of years. I heard (but never saw)
         | that for internal development at Thinking Machines each
         | developer had a single 512-processor board to work on.
         | 
         | You're spot on about the single-bit processor thing though. In
         | their *Lisp implementation, you could declare an integer to be
         | as many bits as you wanted, up to the max 128k bits each
         | processor had.
        
           | rst wrote:
           | The CM-2 also had hardware floating-point units: off-the-shelf
           | chips from Weitek, paired with groups of 32 bit-serial
           | processors via custom chips with "transposers". These would
           | transpose 32-bit floats, read bit-serially out of the
           | memories of 32 bit-serial processors, into 32-bit floats
           | presented in parallel to the FPU chips.
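           | 
           | A rough software sketch of what such a transposer does,
           | assuming 32 processors that each hold one 32-bit float and
           | emit it one bit per step. The names transpose_bits,
           | float_to_bits and bits_to_float are hypothetical, and this
           | models only the data movement, not the actual CM-2 chips.

```python
import struct

WIDTH = 32  # 32 bit-serial processors, 32 bits per float


def float_to_bits(x):
    """IEEE-754 single-precision bit pattern of x, as an int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b):
    """Inverse of float_to_bits."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def transpose_bits(rows):
    """Transpose a 32x32 bit matrix stored as 32 ints.

    Bit r of out[c] equals bit c of rows[r]. Applying it twice
    returns the original rows.
    """
    return [
        sum(((rows[r] >> c) & 1) << r for r in range(WIDTH))
        for c in range(WIDTH)
    ]


# One float per processor (values chosen to be exact in float32).
values = [i * 0.5 for i in range(WIDTH)]
words = [float_to_bits(v) for v in values]

# slices[k] is what the 32 processors emit together at bit-step k.
slices = transpose_bits(words)

# The transposer turns those 32 bit-slices back into 32 whole words
# that can be handed to an FPU in parallel.
recovered = [bits_to_float(w) for w in transpose_bits(slices)]
```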
        
           | rbanffy wrote:
           | Is there a *Lisp compiler that generates code to run on
           | current computers?
        
             | dragontamer wrote:
             | I don't think so. I'd imagine that stuff was lost to the
             | ages.
             | 
             | Today's compilers that generate a similar set of code are
             | OpenCL, CUDA, DirectX / HLSL, OpenGL's GLSL, Apple Metal,
             | AMD HIP, and Intel ISPC.
             | 
             | -----------
             | 
             | Today's computers (or really, GPUs), aren't 4096x wide
             | devices or 65536x wide devices. GPUs are 32-wide or 64-wide
             | natively, and then MIMD'd into parallel parts after that.
             | (AKA: CUDA has the Grid -> Block -> Thread model. I'm
             | pretty sure that Star-Lisp was only Grid -> Thread, with no
             | "intermediate" block in between).
             | 
             | ----------
             | 
             | You probably can get a similar effect as the original CM-2
             | by compiling into AND/OR/XOR and shift-instructions for
             | AVX512 though?
             | 
             | Or maybe AND/OR/XOR + shift instructions on AMD's CDNA2+
             | processor (64x wide and 64-bit, for 4096-wide SIMD per
             | core). It'd be pretty terrible, all else considered,
             | because modern GPUs have a slight MIMD factor there.
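             | 
             | A rough sketch of that idea in Python, assuming each
             | bit-plane lives in one arbitrary-width Python int that
             | stands in for a wide SIMD register (bit i belongs to
             | lane i). The helpers to_planes, from_planes and
             | bitsliced_add are hypothetical names, not *Lisp or
             | AVX512 APIs.

```python
def to_planes(values, width):
    """Pack per-lane integers into bit-planes, least significant first."""
    return [
        sum(((v >> k) & 1) << lane for lane, v in enumerate(values))
        for k in range(width)
    ]


def from_planes(planes, lanes):
    """Unpack bit-planes back into one integer per lane."""
    return [
        sum(((planes[k] >> lane) & 1) << k for k in range(len(planes)))
        for lane in range(lanes)
    ]


def bitsliced_add(a_planes, b_planes):
    """Ripple-carry add of every lane at once using only AND/OR/XOR.

    Each loop iteration handles one bit position for all lanes, like a
    1-bit processor array stepping through an integer bit by bit.
    """
    carry = 0
    out = []
    for a, b in zip(a_planes, b_planes):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out  # final carry is dropped, so sums wrap modulo 2**width


xs = [1, 2, 3, 250]
ys = [5, 6, 7, 10]
sums = from_planes(bitsliced_add(to_planes(xs, 8), to_planes(ys, 8)), len(xs))
print(sums)
# → [6, 8, 10, 4]
```

             | The integer width is just the number of planes, so any
             | bit width works the same way.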
        
               | rbanffy wrote:
               | Wouldn't have to generate code for current SIMD units,
               | just being able to compile *Lisp code and run it on a
               | common computer would make me very happy.
        
         | cleerline wrote:
         | 1-bit processor and 1-bit commands? Is this a typo?
        
           | dragontamer wrote:
           | It was a 1-bit processor. But "1 bit commands" is probably a
           | mistake on my part. Perhaps it'd be more accurate to say
           | "commands over 1-bit registers", or something along those
           | lines.
        
       | jkuria wrote:
       | Anything CM related always reminds me of this Richard Feynman
       | story:
       | 
       | https://longnow.org/essays/richard-feynman-connection-machin...
        
       | nickt wrote:
       | You can get a CM-1 t-shirt from Tamiko Thiel [1], and a little
       | background [2].
       | 
       | [1] https://mission-base-creations.myspreadshop.com/
       | [2] https://www.mission-base.com/tamiko/cm/cm-tshirt.html
        
       | eismcc wrote:
       | Having spoken to someone who was there (Steve Omohundro who wrote
       | StarLisp), the lack of floating point and extreme parallelism
       | were actually a pain because everything had to be written from
       | scratch. Steve gave me an example that they wrote over 200
       | algorithms in a new way, including parallel regex (as I recall).
       | So while the hardware and ideas were amazing, there was a sense
       | that it was a long road to pursue the actual goal of AI research.
        
         | chris_st wrote:
         | One of their mottos was, "Building a machine that will be proud
         | of us."
        
           | rbanffy wrote:
           | I remember reading that as a kid and how much it moved and
           | inspired me.
           | 
           | We'll get there eventually.
        
       ___________________________________________________________________
       (page generated 2022-05-17 23:01 UTC)