[HN Gopher] Reverse engineering a neural network's clever soluti...
___________________________________________________________________
Reverse engineering a neural network's clever solution to binary
addition (2023)
Author : Ameo
Score : 70 points
Date : 2025-11-04 07:22 UTC (4 days ago)
(HTM) web link (cprimozic.net)
(TXT) w3m dump (cprimozic.net)
| IlikeKitties wrote:
| >As I mentioned before, I had imagined the network learning
| some fancy combination of logic gates to perform the whole
| addition process digitally, similarly to how a binary adder
| operates. This trick is yet another example of neural networks
| finding unexpected ways to solve problems.
|
| My intuition is that this kind of solution admits a gradual,
| gradient-friendly path to improvement, which is why it feels
| unintuitive: we tend to think of solutions as all or nothing
| and look for complete ones.
| arjvik wrote:
| The more interesting question is whether it's even possible
| to learn the logic-gate solution through gradient descent.
| scarmig wrote:
| You could riff off an approach similar to
| https://google-research.github.io/self-organising-systems/di...
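|
| One way to make that concrete is to parameterize each gate as a
| soft truth table and let gradient descent pick which gate it
| becomes. A minimal sketch in that spirit (my own illustration;
| the function and its details are assumptions, not taken from
| the linked page):
|
|     import numpy as np
|
|     def soft_gate(a, b, logits):
|         # a, b in [0, 1]; logits has shape (4,), one entry per
|         # truth-table row for inputs 00, 01, 10, 11.
|         table = 1.0 / (1.0 + np.exp(-logits))  # squash to (0, 1)
|         # Bilinear interpolation over the table keeps the output
|         # differentiable in a, b, and the logits.
|         return ((1 - a) * (1 - b) * table[0]
|                 + (1 - a) * b * table[1]
|                 + a * (1 - b) * table[2]
|                 + a * b * table[3])
|
| With strongly saturated logits the gate collapses to a discrete
| boolean function (AND, XOR, ...), so in principle a network of
| these could anneal into the digital adder asked about above.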
| elteto wrote:
| Right, binary gates are discrete elements but neural networks
| operate on a continuous domain.
|
| I'm reminded of the Feynman anecdote when he went to work for
| Thinking Machines and they gave him some task related to
| figuring out routing in the CPU network of the machine, which
| is a discrete problem. He came back with a solution that used
| partial differential equations, which surprised everyone.
| rnhmjoj wrote:
| Original submission:
| https://news.ycombinator.com/item?id=34399142
| drougge wrote:
| This seems interesting, but I got stuck fairly early on when I
| read "all 32,385 possible input combinations". There are two 8
| bit numbers, 16 totally independent bits. That's 65_536
| combinations. 32_385 is close to half that, but not quite.
| Looking at it in binary it's 01111110_10000001, i.e. two 8 bit
| words that are the inverse of each other. How was this number
| arrived at, and why?
|
| Looking later there's also a strange DAC that gives the lowest
| resistance to the least significant bit, thus making it the
| biggest contributor to the output. Very confusing.
| dahart wrote:
| Is that the number of adds that don't overflow an 8-bit result?
|
| On that hunch, I just checked and I get 32896.
|
| Edit: if I exclude either input being zero, I get 32385.
|
| You also get the same number when including input zeros but
| excluding results above 253. But I'd bet on the author's reason
| being filtering of input zeros. Maybe the NN does something bad
| with zeros, maybe can't learn them for some reason.
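|
| A quick brute-force check of the candidate filters (a sketch I
| wrote to verify the counts; the article doesn't say which
| filter was actually used):
|
|     # All ordered pairs of 8-bit operands.
|     pairs = [(a, b) for a in range(256) for b in range(256)]
|
|     # Sums that fit in 8 bits: 32896.
|     print(sum(1 for a, b in pairs if a + b <= 255))
|
|     # Same, with both inputs nonzero: 32385.
|     print(sum(1 for a, b in pairs if a and b and a + b <= 255))
|
|     # Unordered distinct pairs from 1..255 also count 32385.
|     print(255 * 254 // 2)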
| jtsiskin wrote:
| Interesting puzzle. 32385 is 255 pick 2. My guess would be, to
| hopefully make interpretation easier, they always had the
| larger number on one side. So (1,2) but not (2,1). And also 0
| wasn't included. So perhaps their generation loop looks
| something like
|
|     [(i, j) for i in range(255, 0, -1)
|             for j in range(i - 1, 0, -1)]
| joshribakoff wrote:
| You are potentially conflating combinations with permutations.
| bob1029 wrote:
| > While playing around with this setup, I tried re-training the
| network with the activation function for the first layer replaced
| with sin(x) and it ends up working pretty much the same way.
|
| There is some evidence that the activation functions and weights
| can be arbitrarily selected assuming you have a way to evolve the
| topology of the network.
|
| https://arxiv.org/abs/1906.04358 ("Weight Agnostic Neural
| Networks")
| anon291 wrote:
| Very nice. I think people don't appreciate enough the
| correspondence between linear algebra, differential equations,
| and wave behavior.
|
| Roughly speaking, it seems the network is essentially
| converting binary digits to orthogonal basis functions and then
| manipulating those basis functions, followed by a linear
| transformation back into the binary-digit space.
| YeGoblynQueenne wrote:
| >> I created training data by generating random 8-bit unsigned
| integers and adding them together with wrapping.
|
| So, binary addition mod 256, with results in [0,255] (base 10).
| Did the author try the trained network on numbers outside the
| training range?
|
| It's one thing to find that your neural net discovered this one
| neat trick for binary addition with 8-bit numbers, and something
| completely different to find that it figured out binary addition
| in the general case.
|
| How hard the latter would be... depends. What were the activation
| functions? E.g. it is quite possible to learn how to add two
| (arbitrary, base-10) integers with a simple regression for no
| other reason than regression being itself based on addition (ok,
| summation).
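|
| A minimal sketch of that regression point (my own illustration,
| not from the article): a linear model fit on small pairs
| recovers weights of ~[1, 1] and then adds arbitrarily large
| numbers, precisely because the model class is itself a weighted
| sum.
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|     X = rng.integers(0, 256, size=(1000, 2)).astype(float)
|     y = X.sum(axis=1)
|     w, *_ = np.linalg.lstsq(X, y, rcond=None)  # w ~= [1.0, 1.0]
|
|     big = np.array([123456.0, 987654.0])  # far outside training
|     print(big @ w)                        # ~1111110.0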
| xg15 wrote:
| This is really cool and I hope there will be more experiments
| like this.
|
| My takeaway is also that we don't really have a good intuition
| yet for how the internal representations of neural networks
| "work", or what kinds of internal representations can even be
| learned through SGD+backpropagation (and how those
| representations depend on the architecture).
|
| Like in this case, where the author first imagined the network
| would learn a logic network, but the end result was more like an
| analog circuit.
|
| It's possible to construct the "binary adder" network the
| author imagined "from scratch" by handpicking the weights (see
| the sketch below). The interesting question is whether it could
| also be learned, or whether SGD would always produce an
| "analog" solution like this one.
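|
| A minimal sketch of that handpicked construction, assuming hard
| threshold ("step") activations (my own illustration; the
| article's network uses smooth activations, and step units are
| exactly what gradient descent can't differentiate through):
|
|     def step(x):
|         # Hard-threshold "activation": 1 if x > 0 else 0.
|         return 1.0 if x > 0 else 0.0
|
|     def full_adder(a, b, c):
|         # One-bit full adder built from threshold units with
|         # handpicked weights and biases.
|         total = a + b + c
|         h1 = step(total - 0.5)  # at least one input set
|         h2 = step(total - 1.5)  # at least two inputs set
|         h3 = step(total - 2.5)  # all three inputs set
|         carry = h2
|         s = step(h1 - h2 + h3 - 0.5)  # odd parity of inputs
|         return s, carry
|
| Chaining eight of these into a ripple-carry adder reproduces
| the digital solution the author originally expected.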
| bgnn wrote:
| The second step, passing the analog output through shifted tanh
| functions, implements an analog-to-digital converter (ADC).
| ADCs of this type were common back in the BJT days.
|
| So: DAC + sum in the analog domain + ADC is what the NN is
| doing.
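|
| A numeric sketch of that DAC -> analog sum -> ADC view (my own
| illustration, not the article's actual weights): encode each
| operand as one analog value, add them, then read each output
| bit off a suitably phased sine, the kind of periodic readout
| the article's sine/tanh-flavored activations approximate.
|
|     import numpy as np
|
|     def add_analog(a_bits, b_bits):
|         # 8-bit addition, mod 256, done the "analog" way.
|         k = np.arange(8)
|         weights = 2.0 ** k  # DAC: bit k carries weight 2^k
|         n = a_bits @ weights + b_bits @ weights  # analog sum
|         # ADC: bit k of n flips with period 2^(k+1), so a sine
|         # of that period, phased right, is positive exactly
|         # when the bit is set. The mod-256 wrap falls out of
|         # the periodicity for free.
|         bits = -np.sin(np.pi * (n + 0.5) / 2.0 ** k) > 0
|         return bits.astype(int)
|
|     to_bits = lambda v: (v >> np.arange(8)) & 1  # little-endian
|     print(add_analog(to_bits(200), to_bits(77)))
|     # -> bits of (200 + 77) % 256 = 21, i.e. 1 0 1 0 1 0 0 0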
| krbaccord94f wrote:
| Binary layer functions, whether for DACs converting 4-bit or
| 8-bit inputs to a unitary neuron, "allow the network to both
| sum the inputs as well as convert the sum to analog all within
| a single layer ... [to] do it all before any [Ameo] activation
| functions even come into play." This is sin^-1(tan x) in the
| absence of an asymptote.
___________________________________________________________________
(page generated 2025-11-08 23:01 UTC)