[HN Gopher] Thinking Machines - Introduction to Data Parallel Su...
___________________________________________________________________
Thinking Machines - Introduction to Data Parallel Supercomputing
(1989)
Author : gone35
Score : 49 points
Date : 2022-05-15 10:01 UTC (2 days ago)
(HTM) web link (www.youtube.com)
| dragontamer wrote:
| I haven't seen this video yet, but I should note that "Thinking
| Machine's CM2" processor was a 4096x 1-bit SIMD processor.
|
| Which was enough to really have CUDA-like or OpenCL-like code
| back in the day. Operations could be compiled into 1-bit commands
| to be executed in parallel across 4096 "SIMD-lanes / threads"
| (much akin to CUDA-threads).
|
| A lot of the research from the CM2 / Thinking Machines was
| translated into modern GPU code (parallel prefix sum, radix sort,
| etc. etc.). The research done back then really laid the
| foundation for today's embarrassingly parallel work.
|
| There were some other computers that came before CM2 of course.
| But CM2 + Thinking Machines is very clearly part of the great
| history of SIMD-compute / PRAM model of compute / Parallel
| vectors / etc. etc.
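For readers who have not met the parallel prefix sum mentioned above, here is a minimal sketch of the Hillis-Steele formulation, simulated in plain Python. The function name `hillis_steele_scan` is made up for illustration, and the list comprehension only stands in for lanes running in lockstep; this is not CM-2 or GPU code.

```python
def hillis_steele_scan(values):
    """Inclusive prefix sum in about log2(n) lockstep steps (simulated)."""
    lanes = list(values)
    n = len(lanes)
    stride = 1
    while stride < n:
        # Every lane reads the previous step's snapshot, as if all lanes
        # executed the same instruction at the same time.
        prev = lanes
        lanes = [
            prev[i] + prev[i - stride] if i >= stride else prev[i]
            for i in range(n)
        ]
        stride *= 2
    return lanes


if __name__ == "__main__":
    print(hillis_steele_scan([1, 2, 3, 4, 5, 6, 7, 8]))
```

Each pass doubles the stride, so n lanes finish in roughly log2(n) steps at the cost of more total additions than a sequential loop.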
| chris_st wrote:
| Minor clarification -- the CM1 was the 4096-processor machine.
|
| The CM2 had (well, those delivered to customers, anyway) a
| minimum of 8,192 processors, and up to a maximum of 65,536. I
| worked on one for a number of years. I heard (but never saw)
| that for internal development at Thinking Machines each
| developer had a single 512-processor board to work on.
|
| You're spot on about the single-bit processor thing though. In
| their *Lisp implementation, you could declare an integer to be
| as many bits as you wanted, up to the max 128k bits each
| processor had.
| rst wrote:
| The CM-2 also had hardware floating-point units: off-the-shelf
| chips from Weitek, paired with groups of 32 bit-serial
| processors via custom chips with "transposers". These would
| transpose 32-bit floats read bit-serially out of the memories
| of 32 bit-processors into 32-bit floats presented in parallel
| to the FPU chips.
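A rough sketch of what such a transposition does, assuming a simplified model in which reading bit k from all 32 processors yields one 32-bit "slice" whose bit p belongs to processor p. The names `transpose_bits` and `float_bits` are hypothetical, and the real transposer hardware may well have worked differently in detail.

```python
import struct

WIDTH = 32  # 32 bit-serial processors, 32-bit floats


def float_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def transpose_bits(rows):
    """Transpose a 32x32 bit matrix stored as 32 integers.

    Bit p of result[k] equals bit k of rows[p].
    """
    if len(rows) != WIDTH:
        raise ValueError("expected exactly 32 rows")
    result = [0] * WIDTH
    for p, row in enumerate(rows):
        for k in range(WIDTH):
            result[k] |= ((row >> k) & 1) << p
    return result


if __name__ == "__main__":
    words = [float_bits(0.5 * p) for p in range(WIDTH)]
    slices = transpose_bits(words)     # what bit-serial reads would produce
    restored = transpose_bits(slices)  # what the FPU would want to see
    print(restored == words)
```

Because transposing a square bit matrix twice gives back the original, the same routine models both directions: words to bit slices and bit slices back to words.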
| rbanffy wrote:
| Is there a *Lisp compiler that generates code to run on
| current computers?
| dragontamer wrote:
| I don't think so. I'd imagine that stuff was lost to the
| ages.
|
| Today's compilers that generate a similar set of code are
| OpenCL, CUDA, DirectX / HLSL, OpenGL's GLSL, Apple Metal,
| AMD HIP, and Intel ISPC.
|
| -----------
|
| Today's computers (or really, GPUs), aren't 4096x wide
| devices or 65536x wide devices. GPUs are 32-wide or 64-wide
| natively, and then MIMD'd into parallel parts after that.
| (AKA: CUDA has the Grid -> Block -> Thread model. I'm
| pretty sure that Star-Lisp was only Grid -> Thread, with no
| "intermediate" block in between).
|
| ----------
|
| You probably can get a similar effect as the original CM-2
| by compiling into AND/OR/XOR and shift-instructions for
| AVX512 though?
|
| Or maybe AND/OR/XOR + shift instructions on AMD's CDNA2+
| processor (64x wide and 64-bit, for 4096-wide SIMD per
| core). It'd be pretty terrible, all else considered,
| because modern GPUs have a slight MIMD factor there.
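To make the AND/OR/XOR idea concrete, here is a hedged sketch of bit-sliced addition in plain Python. Each "plane" is one big integer whose bit p belongs to lane p, so a single bitwise operation touches every lane at once. The helper names (`pack_planes`, `unpack_planes`, `bit_serial_add`) are invented for this example, and Python integers only stand in for a wide SIMD register; this is not AVX-512 or CM-2 code.

```python
def pack_planes(values, nbits):
    """planes[k] has bit `lane` set iff bit k of values[lane] is set."""
    planes = [0] * nbits
    for lane, v in enumerate(values):
        for k in range(nbits):
            if (v >> k) & 1:
                planes[k] |= 1 << lane
    return planes


def unpack_planes(planes, nlanes):
    """Inverse of pack_planes: recover one integer per lane."""
    return [
        sum(((plane >> lane) & 1) << k for k, plane in enumerate(planes))
        for lane in range(nlanes)
    ]


def bit_serial_add(a_planes, b_planes):
    """Ripple-carry add over bit planes; wraps modulo 2**nbits."""
    carry = 0
    out = []
    for a, b in zip(a_planes, b_planes):
        out.append(a ^ b ^ carry)                # sum bit for every lane
        carry = (a & b) | (carry & (a ^ b))      # carry out for every lane
    return out


if __name__ == "__main__":
    lanes, nbits = 4096, 16
    a = list(range(lanes))
    b = [3 * i + 7 for i in range(lanes)]
    total = bit_serial_add(pack_planes(a, nbits), pack_planes(b, nbits))
    print(unpack_planes(total, lanes)[:4])
```

The loop runs once per bit of the operand, not once per lane, which is why the integer width could be chosen freely on a bit-serial machine: wider integers simply cost more steps.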
| rbanffy wrote:
| Wouldn't have to generate code for current SIMD units,
| just being able to compile *Lisp code and run it on a
| common computer would make me very happy.
| cleerline wrote:
| 1-bit processor and 1-bit commands? Is this a typo?
| dragontamer wrote:
| It was a 1-bit processor. But "1 bit commands" is probably a
| mistake on my part. Perhaps it'd be more accurate to say
| "commands over 1-bit registers", or something along those
| lines.
| jkuria wrote:
| Anything CM related always reminds me of this Richard Feynman
| story:
|
| https://longnow.org/essays/richard-feynman-connection-machin...
| nickt wrote:
| You can get a CM-1 t-shirt from Tamiko Thiel [1], and a little
| background [2].
|
| [1] https://mission-base-creations.myspreadshop.com/ [2]
| https://www.mission-base.com/tamiko/cm/cm-tshirt.html
| eismcc wrote:
| Having spoken to someone who was there (Steve Omohundro who wrote
| StarLisp), the lack of floating point and extreme parallelism
| were actually a pain because everything had to be written from
| scratch. Steve gave me an example that they wrote over 200
| algorithms in a new way, including parallel regex (as I recall).
| So while the hardware and ideas were amazing, there was a sense
| that it was a long road to pursue the actual goal of AI research.
| chris_st wrote:
| One of their mottos was, "Building a machine that will be proud
| of us."
| rbanffy wrote:
| I remember reading that as a kid and how much it moved and
| inspired me.
|
| We'll get there eventually.
___________________________________________________________________
(page generated 2022-05-17 23:01 UTC)