[HN Gopher] The future of deep learning is photonic
       ___________________________________________________________________
        
       The future of deep learning is photonic
        
       Author : pcaversaccio
       Score  : 49 points
       Date   : 2021-07-31 09:36 UTC (1 days ago)
        
 (HTM) web link (spectrum.ieee.org)
 (TXT) w3m dump (spectrum.ieee.org)
        
       | dogma1138 wrote:
       | Lightmatter is apparently shipping their photonic accelerator to
       | early adopters right now https://lightmatter.co/products/envise/
       | 
        | They claim quite a big performance uplift over current GPUs and
        | even better performance-per-watt figures.
       | 
       | However I haven't seen independent benchmarks for their Envise
       | ASIC yet.
       | 
       | They seem to be using a different method with an external laser
       | to "power" the chips using fiber optic cables rather than
       | integrating solid state lasers into the silicon. It's also not
        | based on LCD technology like early attempts at photonic
        | computing, which were essentially sandwiches of LCD screens,
        | solid-state detectors, and amplifiers.
        
         | TaylorAlexander wrote:
          | Yep, and the CEO claimed 10x faster than current tech while
          | using 10% of the electrical power. If it's really 100x more
          | efficient, that is huge. But it's the CEO saying it, so it's
          | hard to say how true it is.
         | 
         | https://youtu.be/t1R7ElXEyag
        
           | dogma1138 wrote:
            | It's an interesting tech, however the big issue with
            | photonics is that building complex logic and memory is hard
            | to near impossible. It's quite good at doing relatively
            | basic operations at scale, which is why the tech can work
            | with components as simple as a bunch of LCD screens that
            | hold your input values as a pixel mask, with a detector at
            | the end. Since the LCD matrix can be quite big, say
            | 1024x1024 pixels, you are able to do basic bitwise ops like
            | XOR on huge matrices. The challenge was always to A) get
            | photonics to be fast enough that their scale can outpace the
            | switching frequency of traditional semiconductors, and B)
            | get to a point where you can actually do useful operations.
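            | 
            | As a rough illustration of the pixel-mask idea (the array
            | size and the XOR op below are just illustrative, not the
            | specifics of any particular device):
            | 
            |     import numpy as np
            | 
            |     # Two 1024x1024 binary "frames", standing in for LCD
            |     # pixel masks that hold the input values
            |     a = np.random.randint(0, 2, (1024, 1024), dtype=np.uint8)
            |     b = np.random.randint(0, 2, (1024, 1024), dtype=np.uint8)
            | 
            |     # One element-wise op applied across the whole array at
            |     # once; the optical version evaluates all ~10^6 positions
            |     # in parallel on the detector plane
            |     result = np.bitwise_xor(a, b)
            |     print(result.shape)  # (1024, 1024)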
           | 
           | My gut feeling is that Envise can actually do only a few
           | basic operations in photonics, however if these operations
           | are sufficiently tailored for specific tasks in ML that might
           | be good enough.
           | 
            | It's not that different from what NVIDIA did with their
            | Tensor cores: they aren't general-purpose ALUs, they only do
            | a few things, but they are very fast at those tasks.
        
             | TaylorAlexander wrote:
             | The CEO of Lightmatter says their chip only does a matrix
             | vector multiply, which he says is a core operation in deep
             | learning. He also says photonics is not good for normal
             | logic operations.
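              | 
              | To give a sense of why that single op covers so much: a
              | plain fully-connected layer is basically that multiply
              | plus a cheap nonlinearity (rough NumPy sketch, sizes made
              | up):
              | 
              |     import numpy as np
              | 
              |     W = np.random.randn(4096, 4096).astype(np.float32)
              |     x = np.random.randn(4096).astype(np.float32)
              | 
              |     # The expensive part: one matrix-vector multiply
              |     # (~16M multiply-accumulates here). This is the step
              |     # a photonic accelerator would take over.
              |     y = W @ x
              | 
              |     # The rest is comparatively cheap element-wise work
              |     y = np.maximum(y, 0.0)  # ReLU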
             | 
             | But that's fine, because I am mostly concerned with
             | accelerating deep learning. I'm a robotics engineer and
             | when I look at large neural networks like GPT-3 I get the
             | sense that robotics could work well with very massive
             | networks, even orders of magnitude larger than GPT-3
             | (imagine not just ingesting text and producing a stream of
             | words, but encoding a multidimensional world state for a
             | robot and producing a desired action based on all current
             | and past signals).
             | 
             | But to put massive neural networks orders of magnitude
              | larger than GPT-3 into a robot requires a significant step
             | change in the efficiency and scale of neural network
             | compute.
             | 
             | So I don't mind if their chip doesn't do standard logic
              | well because a regular Intel chip is great at that. I just
             | want to see significantly more powerful neural network
             | compute. And if the Lightmatter CEO is to be believed (I
             | don't know), their tech could be a boon for machine
             | learning and robotics some day.
        
             | mmmBacon wrote:
             | Photonics doesn't really scale well for this application
             | due to relatively large area requirements and the fact that
              | array size is limited. It's really better suited to helping
              | with data movement than to doing computation.
        
       | civilized wrote:
       | People have been talking about photonics for decades. Is
       | photonics _actually_ well-suited to deep learning? Or is this
       | just another  "new thing is happening, time to remind people of
       | the old thing that keeps never happening" take?
        
         | dasudasu wrote:
         | Nobody was talking seriously about photonic accelerators 10
         | years ago. Optical computing last had a hype cycle in the late
         | 80s/early 90s.
        
       | Havoc wrote:
        | Can we make fiber optic cables smaller than 2nm?
        | 
        | Or is the game plan to make them bigger & nobody cares because
        | there's no heat & it's fast?
        
         | adrusi wrote:
         | Not even remotely an expert on chip design, but deep learning
         | dataflow is a lot more predictable and linear than what a CPU
         | or even a GPU doing actual graphics needs to do. Speed of light
         | latency is probably not an issue since the relevant distances
         | are not one side of the die to the other but rather the
         | distance between adjacent components.
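          | 
          | Back-of-the-envelope, with assumed but plausible distances
          | (and vacuum light speed; in an on-chip waveguide it's a few
          | times slower, same ballpark):
          | 
          |     c = 3.0e8              # speed of light in vacuum, m/s
          |     die_crossing = 0.02    # 20 mm, one side of a die to the other
          |     neighbor_hop = 100e-6  # 100 um, between adjacent components
          | 
          |     print(die_crossing / c * 1e12)  # ~67 ps
          |     print(neighbor_hop / c * 1e12)  # ~0.33 ps, negligible next
          |                                     # to a ~300 ps cycle at 3 GHz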
        
         | jhgb wrote:
          | How could you make a 2nm optical cable? The wavelengths are two
         | orders of magnitude larger than that.
        
         | dasudasu wrote:
         | Photonic integrated circuits are almost as old as regular
         | integrated circuits.
        
         | TaylorAlexander wrote:
          | I don't think the size of silicon transistors has anything to
          | do with this. There is no direct comparison. So they will
          | probably be bigger, but no one cares.
        
       | varelse wrote:
       | By the time any of this is viable commercially, I really wonder
       | where commercial GPUs and accelerators will be.
       | 
       | And they're going to have the home team advantage when that
       | happens. So that means that unless these things are as accessible
        | to tensorflow/pytorch or whatever the heck the framework du jour
       | is at that point (hopefully more like Jax but better), they will
       | get no traction.
       | 
       | Evidence? Every single demonstrably superior CPU architecture
       | that failed to dislodge x86 over the past 50 years. Sure, it's
       | finally happening, but it's also 50 years later.
        
         | Salgat wrote:
         | We've already seen disruptive architectures like Google's
         | Tensor Processing Units. x86 has already been upended for ML,
          | and photonics-based processing units will simply be another
          | PCI card you plug into your computer, just like a GPU.
        
           | AbrahamParangi wrote:
           | How disruptive are TPUs really though? My understanding is
           | that essentially everything is still trained on Nvidia
           | architecture.
        
             | varelse wrote:
             | If you design your network on a TPU, you will tend to use
             | operators that work well on a TPU. And in the end you will
             | have a network that works best on the TPU.
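              | 
              | A toy example of what that nudging looks like (JAX here
              | purely for illustration; any XLA-targeting framework would
              | do): fixed shapes, one big matmul, and a mask instead of
              | per-element branching is the style that compiles well for
              | a TPU, so that's the style you end up writing.
              | 
              |     import jax
              |     import jax.numpy as jnp
              | 
              |     @jax.jit
              |     def layer(x, w, keep):
              |         # mask via where() rather than data-dependent
              |         # Python branching, which jit can't trace
              |         return jnp.where(keep, x @ w, 0.0)
              | 
              |     x = jnp.ones((128, 256))
              |     w = jnp.ones((256, 512))
              |     keep = jnp.ones((128, 512), dtype=bool)
              |     print(layer(x, w, keep).shape)  # (128, 512)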
             | 
             | Lather rinse repeat for any other architecture. You can
             | even make a network that runs best on Graphcore that way,
             | but it won't be fun to do it. You might even get Graphcore
             | to pay top dollar for it though as they both need some good
             | publicity and they have lots of VC left to squander.
             | 
             | This also tends to be true of video games where the
             | platform on which they were developed is the best place to
             | play them rather than their many ports.
        
               | mmmBacon wrote:
               | We've been doing this for years with DSP and networking.
               | So kind of ho-hum from a HW perspective.
               | 
               | If you ask me the thing that makes these things even
               | remotely interesting is the willingness from the SW side
               | to support new HW architectures. Without that you can't
               | have any innovation in HW.
        
         | jorgBaller wrote:
          | Hint: Look at the performance delta between the last 3
          | generations of Nvidia GPUs, keeping the TDP fixed. Current
          | accelerators have nice tricks like reducing precision, adding
          | more tensor multipliers etc., but that's it. All they can do
          | after these optimisations is scale down the feature sizes,
          | which will eventually be limited by physics in a few
          | generations. Then what? A fundamentally different approach is
          | required.
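          | 
          | The precision trick, roughly, sketched in NumPy (real tensor
          | cores do the low-precision math natively and typically
          | accumulate in fp32):
          | 
          |     import numpy as np
          | 
          |     w = np.random.randn(4096, 4096)
          |     x = np.random.randn(4096)
          | 
          |     # Half the bytes per value in fp16, so roughly twice the
          |     # values moved per unit of memory bandwidth, and more
          |     # multipliers fit per mm^2 of silicon
          |     y32 = w.astype(np.float32) @ x.astype(np.float32)
          |     y16 = w.astype(np.float16) @ x.astype(np.float16)
          | 
          |     print(np.float32().nbytes, np.float16().nbytes)  # 4 2
          |     print(np.max(np.abs(y32 - y16.astype(np.float32))))
          |     # some numerical error traded for throughput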
        
         | kadoban wrote:
         | Better CPU arches were never faster, definitely not by much.
        
           | varelse wrote:
           | Going to disagree. I ran circles around contemporary x86 back
           | in the early '90s because of specific instructions Intel
           | denied they would ever need in their processor roadmap. But
           | it really didn't matter and that's one of the most important
           | lessons of my career.
           | 
           | They did similar goofy thinking with respect to the magic
           | transcendental unit on GPUs so it's not like they ever
           | learned. It's not entirely about clock rate.
        
         | etaioinshrdlu wrote:
         | Luckily chip companies know this now and all the alternative
         | deep learning accelerators have some level of support for
         | mainstream frameworks. Getting their support upstreamed to the
          | main framework project is another matter, though, as is
          | reaching the same quality of implementation as the CUDA and
          | CPU backends...
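          | 
          | Concretely, the bar they're chasing is making the hardware a
          | one-line swap for the model author, e.g. with PyTorch's
          | standard device API (nothing vendor-specific assumed):
          | 
          |     import torch
          | 
          |     # The integration point frameworks expose: a device
          |     # abstraction. A well-supported backend makes the
          |     # accelerator choice a single string.
          |     device = torch.device(
          |         "cuda" if torch.cuda.is_available() else "cpu")
          | 
          |     model = torch.nn.Linear(1024, 1024).to(device)
          |     x = torch.randn(8, 1024, device=device)
          |     y = model(x)  # same code regardless of what's underneath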
        
       ___________________________________________________________________
       (page generated 2021-08-01 23:00 UTC)