[HN Gopher] M1076 Analog Matrix Processor
       ___________________________________________________________________
        
       M1076 Analog Matrix Processor
        
       Author : tosh
       Score  : 78 points
       Date   : 2023-11-15 15:22 UTC (7 hours ago)
        
 (HTM) web link (mythic.ai)
 (TXT) w3m dump (mythic.ai)
        
       | liquidify wrote:
       | What makes it analog?
        
         | ganzuul wrote:
         | > In practice it is possible, in a flash memory made on a
         | mature process such as 40nm, to store reliably a range of
         | charges that correspond to a digital resolution of 8 bits.
         | 
         | - https://mythic.ai/wp-
         | content/uploads/2022/02/MythicWhitepape...
        
           | strbean wrote:
           | Another interesting excerpt:
           | 
           | > When a charge is programmed into a flash memory device, its
           | electric field has an effect on any signal passing through
           | it. In the Mythic architecture, the flash transistor acts as
           | a variable resistor that reduces the signal level passing to
           | the output. That reduction is proportional to the analog
           | value stored in the memory. This simple effect implements the
           | multiplication stage found in DNN calculations. The
           | accumulation process, in which the output from each of those
           | calculations is summed, is handled by aggregating the output
           | of an entire column of memory cells. Thanks to these two
           | properties, the Mythic architecture can process an entire
           | input vector in a single step rather than iterating at high
            | speed as in a digital processor.
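The mechanism in that excerpt — each cell scaling its input by a stored value, with outputs summed down a column — can be sketched numerically. Below is a minimal plain-Python model, not Mythic's actual design: the function names are illustrative, and 8-bit quantization stands in for the range of charges the whitepaper says a 40 nm flash cell can store reliably.

```python
# Sketch of a flash-array matrix-vector product. Each "cell" scales
# its input (a voltage) by a stored value (a conductance, quantized
# to 8 bits); each output column sums its cells' contributions,
# as Kirchhoff's current law does on a shared wire.

def quantize_8bit(w, w_max=1.0):
    """Map a weight in [-w_max, w_max] to one of 256 storable levels."""
    levels = 255
    code = round((w + w_max) / (2 * w_max) * levels)
    return (code / levels) * 2 * w_max - w_max

def analog_matvec(weights, inputs):
    """weights: rows x cols of stored 'conductances'; inputs: one
    'voltage' per row. Returns one 'current' per column. The whole
    input vector is handled in a single conceptual step."""
    n_cols = len(weights[0])
    stored = [[quantize_8bit(w) for w in row] for row in weights]
    return [sum(stored[r][c] * inputs[r] for r in range(len(inputs)))
            for c in range(n_cols)]

W = [[0.5, -0.25],
     [1.0,  0.75]]
V = [2.0, 4.0]
print(analog_matvec(W, V))  # close to [5.0, 2.5], up to 8-bit error
```

The inner `sum` is the part the hardware gets for free: in the digital model it is an explicit loop, while in the analog array it is just currents merging on a column wire.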
        
       | teruakohatu wrote:
       | If you are wondering about the analog part:
       | 
       | > Analog computing provides the ultimate compute-in-memory
       | processing element. The term compute-in-memory is used very
       | broadly and can mean many things. Our analog compute takes
       | compute-in-memory to an extreme, where we compute directly inside
       | the memory array itself. This is possible by using the memory
       | elements as tunable resistors, supplying the inputs as voltages,
       | and collecting the outputs as currents. We use analog computing
       | for our core neural network matrix operations, where we are
       | multiplying an input vector by a weight matrix.
       | 
       | > Analog computing provides several key advantages. First, it is
       | amazingly efficient; it eliminates memory movement for the neural
       | network weights since they are used in place as resistors.
       | Second, it is high performance; there are hundreds of thousands
       | of multiply-accumulate operations occurring in parallel when we
       | perform one of these vector operations. Given these two
       | properties, analog computing is the core of our high-performance
       | yet highly-efficient system.
       | 
       | https://mythic.ai/technology/analog-computing/
        
         | nabla9 wrote:
         | I suspect that this is where all inference processors are
         | heading eventually. The benefits are just too great when exact
         | computation is not required.
         | 
         | Training might be harder to implement.
        
           | pclmulqdq wrote:
           | This is a very specific technology that has serious scaling
           | issues, and the neural networks coming out today are huge in
           | comparison to YOLOv5 and ResNet. The company has already
           | failed once. This will probably have its niche in some
           | computer vision stuff for a little while, but models have
           | already outgrown it.
        
             | nabla9 wrote:
              | This particular low-level implementation of an analog
              | matrix processor may fail, but some other implementation
              | of the idea will succeed.
        
               | pclmulqdq wrote:
               | There are some fundamental limits to analog computing
               | that start to really hurt at the small nodes, which you
               | need to scale up. There's a very good reason they are
               | stuck at 40 nm.
        
             | mwbajor wrote:
              | I have a background in analog circuits and a bit of
              | analog computing, and I will tell you this: analog
              | computers have been around for a while in commercial
              | applications, and to this day they're used to great
              | success in some niches. They're limited in flexibility,
              | since they are not really reprogrammable; analog circuits
              | need to be designed for each specific application. There
              | are papers and prototypes of "reconfigurable" analog
              | versions of FPGAs, but they are limited by physical
              | scalability issues, noise, routing, etc.
        
         | inasio wrote:
          | It's basically Ohm's law V = RI, or in this case I = GV
          | (conductance times voltage), on a lattice of wires. On many
         | analog devices setting the values (write) can be very time
         | consuming, although the read operation can indeed be fast and
         | super low energy. Unclear if this is the case here.
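That description fits in a few lines of Python. This is a hypothetical single column of cells with made-up component values, just to make the units concrete:

```python
# Ohm's law per cell (I = G * V: conductance in siemens, voltage in
# volts), with Kirchhoff summation along the shared output wire.
G = [2e-6, 5e-6, 1e-6]    # three cells' conductances on one column
V = [0.3, 0.1, 0.4]       # input voltages driving the three rows
I_out = sum(g * v for g, v in zip(G, V))  # current on the column wire
print(I_out)              # ~1.5e-06 A
```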
        
       | dumbo-octopus wrote:
       | Veritasium did a decent explainer/interview with this company,
       | about this product. https://www.youtube.com/watch?v=GVsUOuSjvcg
        
       | moffkalast wrote:
       | I don't suppose these will be retail purchasable any time soon?
       | Would be a useful thing to plug into the Pi 5's new PCIe port,
        | not unlike an NCS2. Although with probably very spotty software
       | support...
        
       | ThinkBeat wrote:
       | Hmm
       | 
        | This chip is said to perform 25 TOPS. An A100 80GB SXM is said
        | to perform 1248 TOPS, though that is an entire card, not a
        | single chip.
       | 
       | You could theoretically achieve the same with 50 M1076
       | processors.
       | 
       | Would the benefit be any of:
       | 
        | * a lot cheaper
        | * more energy efficient
        | * smaller size
        | * easier to mass produce?
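A back-of-envelope check on the energy question. The power figures below are approximate vendor spec-sheet numbers I'm assuming (roughly 3 W for the M1076, 400 W TDP for the A100 SXM), not figures from the thread, so treat the result as illustrative only:

```python
# Rough TOPS-per-watt comparison; power figures are assumed
# approximate vendor numbers, not measured benchmarks.
a100_tops, a100_watts = 1248, 400   # TOPS figure quoted upthread
m1076_tops, m1076_watts = 25, 3     # Mythic's claimed ~3 W typical

chips_needed = a100_tops / m1076_tops      # ~50 chips
array_watts = chips_needed * m1076_watts   # ~150 W for equal TOPS
print(chips_needed, array_watts)
```

By these rough numbers the M1076 array would draw well under half the A100's power for the same nominal TOPS, which is the energy-efficiency argument in a nutshell.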
        
         | blovescoffee wrote:
          | But for small / embedded models, you'd much prefer to use
          | this chip from Mythic rather than a huge (and possibly more
          | expensive) A100.
        
       ___________________________________________________________________
       (page generated 2023-11-15 23:00 UTC)