[HN Gopher] The future of deep learning is photonic
___________________________________________________________________
The future of deep learning is photonic
Author : pcaversaccio
Score : 49 points
Date : 2021-07-31 09:36 UTC (1 day ago)
(HTM) web link (spectrum.ieee.org)
(TXT) w3m dump (spectrum.ieee.org)
| dogma1138 wrote:
| Lightmatter is apparently shipping their photonic accelerator to
| early adopters right now https://lightmatter.co/products/envise/
|
| They claim a quite big performance uplift over current GPUs,
| and even better performance-per-watt figures.
|
| However, I haven't seen independent benchmarks for their
| Envise ASIC yet.
|
| They seem to be using a different method, with an external
| laser that "powers" the chips over fiber optic cables rather
| than integrating solid-state lasers into the silicon. It's
| also not based on LCD technology like early attempts at
| photonic computing, which were essentially a sandwich of LCD
| screens and solid-state detectors and amplifiers.
| TaylorAlexander wrote:
| Yep, and the CEO claimed 10x faster than current tech while
| using 10% of the electrical power, which multiplies out to
| roughly 100x the performance per watt. If it's really 100x
| more efficient, that is huge. But it's the CEO saying it, so
| it's hard to say how true it is.
|
| https://youtu.be/t1R7ElXEyag
| dogma1138 wrote:
| It's an interesting tech; however, the big issue with
| photonics is that building complex logic and memory is hard
| to near impossible. It's quite good at doing relatively
| basic operations at scale, which is why the tech can work
| with components as simple as a bunch of LCD screens that
| hold your input values as a pixel mask and a detector at the
| end. Since the LCD matrix can be quite big, say 1024x1024
| pixels, you are able to do basic bitwise ops like XOR on
| huge matrices. The challenge was always to A) get photonics
| to be fast enough that their scale can outpace the switching
| frequency of traditional semiconductors, and B) get to a
| point where you can actually do useful operations.
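|
| In software terms the LCD scheme is just a massively
| parallel elementwise op; a toy numpy sketch of the idea (my
| own illustration, not any real device's interface):
|
|   import numpy as np
|
|   # Two 1024x1024 binary "pixel masks", one per LCD layer.
|   a = np.random.randint(0, 2, (1024, 1024), dtype=np.uint8)
|   b = np.random.randint(0, 2, (1024, 1024), dtype=np.uint8)
|
|   # Electronically this is ~10^6 XORs; optically the whole
|   # array is processed in a single exposure.
|   result = a ^ b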
|
| My gut feeling is that Envise can actually do only a few
| basic operations in photonics; however, if those operations
| are sufficiently tailored to specific ML tasks, that might
| be good enough.
|
| It's not that different from what NVIDIA did with their
| Tensor Cores: they aren't general-purpose ALUs and they only
| do a few things, but they are very fast at those tasks.
| TaylorAlexander wrote:
| The CEO of Lightmatter says their chip only does a matrix
| vector multiply, which he says is a core operation in deep
| learning. He also says photonics is not good for normal
| logic operations.
|
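| Roughly, a dense layer is just that matrix-vector product
| plus a little electronic glue; a minimal numpy sketch (my
| own illustration, nothing to do with Lightmatter's actual
| API):
|
|   import numpy as np
|
|   W = np.random.randn(512, 784)  # layer weights
|   b = np.random.randn(512)       # bias
|   x = np.random.randn(784)       # input vector
|
|   # W @ x is the matrix-vector multiply a photonic core
|   # would accelerate; the bias add and ReLU stay electronic.
|   y = np.maximum(W @ x + b, 0)
|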
| But that's fine, because I am mostly concerned with
| accelerating deep learning. I'm a robotics engineer and
| when I look at large neural networks like GPT-3 I get the
| sense that robotics could work well with very massive
| networks, even orders of magnitude larger than GPT-3
| (imagine not just ingesting text and producing a stream of
| words, but encoding a multidimensional world state for a
| robot and producing a desired action based on all current
| and past signals).
|
| But to put massive neural networks orders of magnitude
| larger than GPT-3 into a robot requires a significant step
| change in the efficiency and scale of neural network
| compute.
|
| So I don't mind if their chip doesn't do standard logic
| well, because a regular Intel chip is great at that. I just
| want to see significantly more powerful neural network
| compute. And if the Lightmatter CEO is to be believed (I
| don't know), their tech could be a boon for machine learning
| and robotics some day.
| mmmBacon wrote:
| Photonics doesn't really scale well for this application
| due to relatively large area requirements and the fact that
| array size is limited. It's really better suited to helping
| with data movement than for computation.
| civilized wrote:
| People have been talking about photonics for decades. Is
| photonics _actually_ well-suited to deep learning? Or is this
| just another "new thing is happening, time to remind people of
| the old thing that keeps never happening" take?
| dasudasu wrote:
| Nobody was talking seriously about photonic accelerators 10
| years ago. Optical computing last had a hype cycle in the late
| 80s/early 90s.
| Havoc wrote:
| Can we make fiber optic cables smaller than 2nm?
|
| Or is the game plan to make them bigger, and nobody cares
| because there's no heat and they're fast?
| adrusi wrote:
| Not even remotely an expert on chip design, but deep
| learning dataflow is a lot more predictable and linear than
| what a CPU, or even a GPU doing actual graphics, needs to
| handle. Speed-of-light latency is probably not an issue,
| since the relevant distances are not one side of the die to
| the other but rather the distances between adjacent
| components.
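|
| For scale, a rough back-of-envelope (my numbers, assuming
| vacuum light speed, so real waveguides are somewhat slower):
|
|   C = 3e8        # speed of light in vacuum, m/s
|   die = 0.02     # ~2 cm, one side of a large die to the other
|   pitch = 1e-4   # ~100 um between adjacent on-chip components
|
|   print(die / C)    # ~6.7e-11 s, i.e. ~67 ps edge to edge
|   print(pitch / C)  # ~3.3e-13 s, sub-ps between neighbors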
| jhgb wrote:
| How could you make a 2nm optics cable? The wavelengths are two
| orders of magnitude larger than that.
| dasudasu wrote:
| Photonic integrated circuits are almost as old as regular
| integrated circuits.
| TaylorAlexander wrote:
| I don't think the size of silicon transistors has anything
| to do with this; there is no direct comparison. So they will
| probably be bigger, but no one will care.
| varelse wrote:
| By the time any of this is viable commercially, I really wonder
| where commercial GPUs and accelerators will be.
|
| And they're going to have the home-team advantage when that
| happens. So unless these things are as accessible from
| tensorflow/pytorch or whatever the heck the framework du
| jour is at that point (hopefully more like Jax but better),
| they will get no traction.
|
| Evidence? Every single demonstrably superior CPU architecture
| that failed to dislodge x86 over the past 50 years. Sure, it's
| finally happening, but it's also 50 years later.
| Salgat wrote:
| We've already seen disruptive architectures like Google's
| Tensor Processing Units. x86 has already been upended for ML,
| and photonics based processing units will simply be another PCI
| card you plug into your computer just like a GPU.
| AbrahamParangi wrote:
| How disruptive are TPUs really though? My understanding is
| that essentially everything is still trained on Nvidia
| architecture.
| varelse wrote:
| If you design your network on a TPU, you will tend to use
| operators that work well on a TPU. And in the end you will
| have a network that works best on the TPU.
|
| Lather, rinse, repeat for any other architecture. You can
| even make a network that runs best on Graphcore that way,
| but it won't be fun to do. You might even get Graphcore to
| pay top dollar for it, though, as they need some good
| publicity and still have lots of VC money to squander.
|
| This also tends to be true of video games where the
| platform on which they were developed is the best place to
| play them rather than their many ports.
| mmmBacon wrote:
| We've been doing this for years with DSP and networking.
| So kind of ho-hum from a HW perspective.
|
| If you ask me, the thing that makes these things even
| remotely interesting is the willingness on the SW side to
| support new HW architectures. Without that you can't have
| any innovation in HW.
| jorgBaller wrote:
| Hint: Look at the performance delta between the last 3
| generations of nvidia gpus, keeping the TDP fixed. Current
| accelerators have nice tricks like reducing precision,
| adding more tensor multipliers etc., but that's it. All they
| can do after these optimisations is scale down the feature
| sizes, which will eventually be limited by physics in a few
| generations. Then what? A fundamentally different approach
| is required.
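|
| The reduced-precision trick in one illustrative numpy
| snippet (my example, not any vendor's code): same matmul,
| half the bits moved and multiplied, minus some accuracy:
|
|   import numpy as np
|
|   x = np.random.randn(256, 256).astype(np.float32)
|   w = np.random.randn(256, 256).astype(np.float32)
|
|   y32 = x @ w                                        # full precision
|   y16 = x.astype(np.float16) @ w.astype(np.float16)  # half precision
|
|   # The price: a small numeric error on every element.
|   print(np.max(np.abs(y32 - y16.astype(np.float32))))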
| kadoban wrote:
| Better CPU arches were never faster, definitely not by much.
| varelse wrote:
| Going to disagree. I ran circles around contemporary x86 back
| in the early '90s because of specific instructions Intel
| denied they would ever need in their processor roadmap. But
| it really didn't matter and that's one of the most important
| lessons of my career.
|
| They did similar goofy thinking with respect to the magic
| transcendental unit on GPUs so it's not like they ever
| learned. It's not entirely about clock rate.
| etaioinshrdlu wrote:
| Luckily chip companies know this now, and all the
| alternative deep learning accelerators have some level of
| support for mainstream frameworks. Getting that support
| upstreamed into the main framework projects is another
| matter, though, as is reaching the quality of implementation
| that the CUDA and CPU backends have.
___________________________________________________________________
(page generated 2021-08-01 23:00 UTC)