[HN Gopher] Cornell and NTT's Physical Neural Nets Enable Arbitr...
___________________________________________________________________
Cornell and NTT's Physical Neural Nets Enable Arbitrary Physical
System Training
Author : rch
Score : 18 points
Date : 2021-05-29 13:35 UTC (1 day ago)
(HTM) web link (syncedreview.com)
(TXT) w3m dump (syncedreview.com)
| rich_sasha wrote:
| Hard to deduce much from the article. Is it that it's a NN where
| the individual components are physical transforms?
|
| > On the MNIST handwritten digit classification task, the
| trainable SHG transformations boost the performance of digital
| operations from roughly 90 percent accuracy to 97 percent.
|
| It is hard to take it seriously when 97% on MNIST is achievable
| with the kind of tutorials bundled at the end of PyTorch
| installation guides - "see you can make a DNN model in 10 lines
| of code!".
| rch wrote:
| The paper is linked at the bottom of the article: _Deep
| physical neural networks enabled by a backpropagation algorithm
| for arbitrary physical systems_ --
| https://arxiv.org/abs/2104.13386
|
| > physics-aware training (PAT)... allows us to efficiently and
| accurately execute the backpropagation algorithm on any
| sequence of physical input-output transformations, directly _in
| situ_.
| Animats wrote:
| What they're doing, I think, is compiling a trained neural net
| into a different form.
|
| (1), training input data (e.g., an image) is input to the
| physical system, along with parameters.
|
| (2), in the forward pass, the physical system applies its
| transformation to produce an output.
|
| (3), the physical output is compared to the intended output
| (e.g., for an image of an '8', a predicted label of 8) to compute
| the error.
|
| (4), using a differentiable digital model to estimate the
| gradients of the physical system(s), the gradient of the loss is
| computed with respect to the controllable parameters.
|
| (5), the parameters are updated according to the inferred
| gradient.
|
| What they mean by a "physical system" is a series of analog
| elements with lots of tuning parameters. Like filters. This is a
| system for setting the tuning parameters. You have to be able to
| simulate the "physical system", and it has to be mostly
| differentiable, so you can tune by hill-climbing.
|
| The control systems people ought to like this, because the output
| is a control system that's made of components with predictable
| and continuous properties. You want to know that if it does the
| right thing for an input of 1.0 and 1.5, it doesn't do something
| totally unexpected for 1.365. This may be a way to get there.
|
| This may be the mechanism behind "muscle memory". Tasks get
| optimized down to a control system that executes fast, but
| doesn't retrain easily.
|
| The problems they chose to work on seem strange, but that may
| reflect their funding or interests. This might be worth trying
| for, say, quadcopter control. You might be able to train a neural
| net controller and then hammer it down into a quick little
| algorithm that can fit in the onboard computer.
|
| (I subscribe to IEEE Control Systems Journal, and I understand
| maybe 15% of it.)
| [deleted]
| teruakohatu wrote:
| The hybrid approach of calculating loss with a physical system
| and then calculating the gradient using a model narrows the
| simulation-reality gap. The cost would surely be much, much
| slower training. If the physical process required, for example,
| heating and cooling an oven, training would take a very long
| time.
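|
| To put rough numbers on that (purely hypothetical: a 10-minute
| heat-and-cool cycle per forward pass versus ~1 ms for a digital
| pass, one pass per MNIST training image):

```python
# Back-of-envelope cost of in-situ training when the "physical
# system" is slow - all numbers hypothetical.
passes = 60_000                  # one forward pass per MNIST image
oven_minutes_per_pass = 10       # hypothetical thermal cycle time
digital_ms_per_pass = 1          # hypothetical digital forward pass

oven_days = passes * oven_minutes_per_pass / (60 * 24)
digital_seconds = passes * digital_ms_per_pass / 1000

print(f"one epoch in the oven: ~{oven_days:.0f} days")
print(f"one epoch digitally:   ~{digital_seconds:.0f} seconds")
```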
| [deleted]
___________________________________________________________________
(page generated 2021-05-30 23:01 UTC)