[HN Gopher] Class-specific diffractive cameras with all-optical ...
___________________________________________________________________
Class-specific diffractive cameras with all-optical erasure of
undesired objects
Author : rntn
Score : 40 points
Date   : 2022-08-15 12:59 UTC (1 day ago)
(HTM) web link (elight.springeropen.com)
(TXT) w3m dump (elight.springeropen.com)
| Greg_hamel wrote:
| This is pure physics gold.
|
| And the fact that the paper is available for free is just added
| gravy!
| elil17 wrote:
| If you think about it in the abstract, it's not that weird. Okay,
| you're computing a function with light using some diffraction
| gratings.
|
| The outcome, though, is mind-boggling: a camera that can only
| take pictures of the number two, no other numbers. Totally
| magical!
| api wrote:
| Can you compute a neural network this way? Or do other forms of
| useful computation?
| Enginerrrd wrote:
| If I understand correctly that's sort of exactly what this
| is. The geometry of the diffraction gratings encodes a
| forward propagation model trained as a classifier for the number
| "2".
|
| I don't quite understand the mathematics of how it was
| trained, but they were able to discretize the geometry of
| those layers somehow into little 0.4mm pixels of "trainable
| diffractive neurons" and they simulated light transmission
| through the layers to compute a loss function.
|
| I'm really surprised that this was computationally feasible.
| Simulation of light through the gratings must have been cheap
| enough as a function evaluation to train the network.
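| For intuition, here is a minimal sketch (not the paper's code) of
| what that forward simulation might look like: a few phase-only
| layers with free-space propagation between them, simulated with
| the angular spectrum method in NumPy. The 0.4mm pixel pitch is
| from the paper; the grid size, distances, wavelength, and the
| random "weights" are all made up.
|
|     import numpy as np
|
|     # 64x64 grid of 0.4 mm "diffractive neurons" (pitch from the
|     # paper); wavelength and layer spacing are toy values
|     N, pitch, wavelength, z = 64, 0.4e-3, 0.75e-3, 40e-3
|
|     fx = np.fft.fftfreq(N, d=pitch)        # spatial frequencies (1/m)
|     FX, FY = np.meshgrid(fx, fx)
|     arg = 1 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
|     kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0))
|     H = np.exp(1j * kz * z) * (arg > 0)    # propagation transfer fn
|
|     def propagate(u):
|         # free-space propagation over distance z (angular spectrum)
|         return np.fft.ifft2(np.fft.fft2(u) * H)
|
|     rng = np.random.default_rng(0)
|     phases = rng.uniform(0, 2 * np.pi, (3, N, N))  # 3 phase layers
|
|     def forward(u):
|         for phi in phases:                 # propagate, then modulate
|             u = propagate(u) * np.exp(1j * phi)
|         return propagate(u)                # final hop to the sensor
|
|     x = np.zeros((N, N)); x[24:40, 28:36] = 1.0   # toy input object
|     y = np.abs(forward(x)) ** 2            # sensor records intensity
|     loss = np.mean((y / y.max() - x) ** 2) # want '2's imaged as-is
|     print(f"loss = {loss:.4f}")
|
| The whole pipeline is FFTs and pointwise multiplies, so each
| evaluation is cheap and differentiable, which is presumably what
| makes training the phase maps by backpropagation feasible.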
| Lramseyer wrote:
| I would imagine that you generate the desired transform
| function of a diffractive structure rather than the
| structure itself, because the structure is ultimately
| derived from the transform function. Since the transform
| function is basically a 2D Fourier transform and a spatial
| frequency/phase plot, it's not _that_ computationally
| costly. Once you settle on functions you like, you then
| generate and/or simulate a diffractive structure and see if
| it behaves how you expect.
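| And once you settle on a phase profile, turning it into a
| physical structure is essentially a thickness map. A sketch
| (the wavelength and refractive index are placeholder values):
|
|     import numpy as np
|
|     wavelength = 0.75e-3   # placeholder design wavelength (m)
|     n = 1.72               # placeholder refractive index
|
|     # some per-pixel phase map you settled on
|     rng = np.random.default_rng(1)
|     phase = rng.uniform(0, 2 * np.pi, (64, 64))
|
|     # a phase delay of phi (relative to air) needs extra material
|     # thickness h = phi * lambda / (2 * pi * (n - 1))
|     height = phase * wavelength / (2 * np.pi * (n - 1))
|     print(f"max relief: {height.max() * 1e3:.2f} mm")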
| Lramseyer wrote:
| Sort of, I didn't dive too far into the math, but it looks
| like each diffractive structure is akin to a layer of a neural
| net, tuned for a set of spatial frequencies and phases, which
| combine (like the layers of a neural net) to recognize more
| complex objects.
|
| There are a few gotchas in that statement though - for one, I
| would assume that the convolutional algorithms as well as the
| underlying matrix operations may be different. But at the end
| of the day, you're approximating a complex function using an
| array of simple functions with different weights and scale
| factors. The other gotcha is that these diffractive structures
| assume monochromatic light, so they're probably not too useful
| in most normal situations with normal light sources.
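| To make the monochromatic gotcha concrete: a fixed relief
| pattern only produces the intended phase delays at its design
| wavelength. A quick check, with made-up numbers:
|
|     import numpy as np
|
|     n, h = 1.72, 0.3e-3                      # placeholder values
|     for lam in (0.70e-3, 0.75e-3, 0.80e-3):  # design +/- ~7%
|         phi = 2 * np.pi * (n - 1) * h / lam  # phase through feature
|         print(f"{lam * 1e3:.2f} mm -> {np.degrees(phi):.0f} deg")
|
| The same structure gives different phases across even a modest
| bandwidth, so broadband light would smear the computation.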
| SiempreViernes wrote:
| It can do the same sort of computational work any stack of
| analogue filters can do: it does _one thing_ very fast, and if
| you want something else done you must first create those
| filters; the frame holding the stack is of no help at all.
| stavros wrote:
| > Okay, you're computing a function with light using some
| diffraction gratings.
|
| Our definitions of "not that weird" are very different.
| sbaiddn wrote:
| TL/DR, but far-field diffraction is the Fourier transform of
| the aperture (the math is straightforward enough, an integral
| of an exp).
|
| It blew my mind when I did it in school, yet... there was the
| proof that it worked!
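| It is also a few lines to check numerically. A sketch (1-D slit,
| far field via an FFT; you get the classic sinc^2 single-slit
| pattern):
|
|     import numpy as np
|
|     N = 2048
|     aperture = np.zeros(N)
|     aperture[N // 2 - 16 : N // 2 + 16] = 1.0  # single slit
|
|     # Fraunhofer: far-field amplitude is ~ the Fourier transform
|     # of the aperture, so intensity follows the sinc^2 pattern
|     field = np.fft.fftshift(np.fft.fft(aperture))
|     intensity = np.abs(field) ** 2
|     print(intensity[N // 2], intensity[N // 2 + 100])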
| SiempreViernes wrote:
| > a camera that can only take pictures of the number two, no
| other numbers.
|
| Well, to be precise, it makes a complete (deterministic) mess of
| any other numbers. But given the output and the filters you can
| probably unfold the camera "PSF" and get back whatever it was
| that it saw.
| dplavery92 wrote:
| From the parent article:
|
| >Importantly, this diffractive camera is not based on a
| standard point-spread function-based pixel-to-pixel mapping
| between the input and output FOVs, and therefore, it does not
| automatically result in signals within the output FOV for the
| transmitting input pixels that statistically overlap with the
| objects from the target data class. For example, the
| handwritten digits '3' and '8' in Fig. 2c were completely
| erased at the output FOV, regardless of the considerable
| amount of common (transmitting) pixels that they
| statistically share with the handwritten digit '2'. Instead
| of developing a spatially-invariant point-spread function,
| our designed diffractive camera statistically learned the
| characteristic optical modes possessed by different training
| examples, to converge as an optical mode filter, where the
| main modes that represent the target class of objects can
| pass through with minimum distortion of their relative phase
| and amplitude profiles, whereas the spatial information
| carried by the characteristic optical modes of the other data
| classes were scattered out.
|
| It seems like that may not be possible.
|
| Later on in the article:
|
| >It is important to emphasize that the presented diffractive
| camera system does not possess a traditional, spatially-
| invariant point-spread function. A trained diffractive camera
| system performs a learned, complex-valued linear
| transformation between the input and output fields that
| statistically represents the coherent imaging of the input
| objects from the target data class.
|
| Note here that the learned transformation is linear (and so is
| the Fourier transform), but you cannot invert from output to
| input because the sensor measures real-valued intensities of
| complex-valued fields: all the phase information is lost.
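| A toy illustration of that last point, with a random complex
| matrix standing in for the camera's learned linear transform:
| two different input fields can produce exactly the same
| intensity readout, so the field-to-sensor map is not invertible.
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|     A = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
|
|     x1 = rng.normal(size=8) + 1j * rng.normal(size=8)  # a field
|     x2 = np.exp(1j * 0.7) * x1   # same field, shifted in phase
|
|     y1 = np.abs(A @ x1) ** 2     # what the sensor records
|     y2 = np.abs(A @ x2) ** 2
|     print(np.allclose(y1, y2))   # True: distinct fields, same readout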
___________________________________________________________________
(page generated 2022-08-16 23:00 UTC)