[HN Gopher] Differentiable Logic Cellular Automata
___________________________________________________________________
Differentiable Logic Cellular Automata
Author : eyvindn
Score : 413 points
Date : 2025-03-06 23:43 UTC (23 hours ago)
(HTM) web link (google-research.github.io)
(TXT) w3m dump (google-research.github.io)
| thatguysaguy wrote:
| This writing feels so strongly LLM flavored. It's too bad, since
| I've really liked Alexander Mordvintsev's other work.
| owenpalmer wrote:
| Which portion of the text gave you that impression?
| thatguysaguy wrote:
| > To answer this, we'll start by attacking Conway's Game of
| Life - perhaps the most iconic cellular automata, having
| captivated researchers for decades
|
| > At the heart of this project lies...
|
| > This powerful paradigm, pioneered by Mordvintsev et al.,
| represents a fundamental shift in...
|
| (Not only is this clearly LLM-style, I doubt someone working
| in a group w/ Mordvintsev would write this)
|
| > Traditional cellular automata have long captivated...
|
| > In the first stage, each cell perceives its environment.
| Think of it as a cell sensing the world around it.
|
| > To do this, it uses Sobel filters, mathematical tools
| designed to numerically approximate spatial gradients
|
| Mathematical tools??? This is a deep learning paper my guy.
|
| > Next, the neural network steps in.
|
| ...
|
| And it just keeps going. If you ask ChatGPT or Claude to
| write an essay for you, this is the style you get. I suffered
| through it b/c again, I really like Mordvintsev's work and
| have been following this line of research for a while, but it
| feels pretty rude to make people read this.
| kelseyfrog wrote:
| The reason LLMs write like that is, unsurprisingly, that
| some people write like that. In fact many of them do - it's
| not uncommon.
|
| If you have proof, like logits that are statistically
| significant for LLM output, that would be appreciated -
| otherwise it's just arguing over style.
| thatguysaguy wrote:
| I've read a _lot_ of deep learning papers, and this is
| extremely atypical. I agree with you that if there were
| any sort of serious implications then it'd be important
| to establish proof, but in the case of griping on a forum
| I think the standard of evidence is much lower.
| Nevermark wrote:
| > in the case of griping on a forum I think the standard
| of evidence is much lower.
|
| Uh, no. Human "slop" is no better than AI slop.
|
| There is no good purpose for a constant hum of
| predictable, poorly supported "oh that's LLM" gripes, if
| we care about the quality of a forum.
| K0balt wrote:
| Yeah, it's disheartening that people often think my
| writing (most of it predates gpt3) is llm, and some of my
| favourite writers also fall under this wet blanket. LLMs
| just copy the most common writing style, so now if you
| write in a common way you are "llm".
| 0xfffafaCrash wrote:
| I've also had my writing misidentified as being LLM-
| produced on multiple occasions in the last month.
| Personally, I don't really care if some writing is
| generated by AI if the contents contain solid arguments
| and reasoning, but when you haven't used generative AI in
| the production of something it's a weird claim to respond
| to.
|
| Before GPT3 existed, I often received positive feedback
| about my writing and now it's quite the opposite.
|
| I'm not sure whether these accusations of AI generation
| are from genuine belief (and overconfidence) or some
| bizarre ploy for standing/internet points. Usually these
| claims of detecting AI generation get bolstered by others
| who also claim to be more observant than the average
| person. You can know they're wrong in cases where you
| wrote something yourself but it's not really provable.
| ziddoap wrote:
| A lot of these are very close to stuff I have written. Not
| saying this piece did or didn't get a pass through an LLM,
| I have no idea, but it really makes me wonder how many
| people accuse me of using an LLM when it's just how I
| write.
|
| I feel awful for anyone going to school now, or who will be
| in the future. I probably would have been kicked out, seeing
| how easily people say "LLM" whenever they read some common
| phrasing, a particular word, structure of the writing, etc.
| robwwilliams wrote:
| Ran the entire text through Claude 3.7 to evaluate style.
| Anyone on HN can do the same.
|
| I'd rather hear about the content instead of this meta
| analysis on editorial services. Writers used to have
| professional copy editors with wicked fine-tipped green
| pens. Now we expect more incompetence from humans. Let me
| add some more typos to this comment.
| showmexyz wrote:
| Research papers are written like this, and LLMs are trained
| on arxiv.
| K0balt wrote:
| Are you sure you aren't just falling into the "it's all llm"
| trap? A lot of common writing styles are similar, and the most
| common ones are what LLMs imitate. I often am accused of llm
| writing. I don't publish llm text because I think it is a
| social harm, so it's pretty demoralising to have people call
| out my writing as "llm slop". OTOH, I have a few books
| published and people seem to find them handy, so there's that.
| BriggyDwiggs42 wrote:
| Yup, I independently noticed passages with phrases and word
| choice mimicking llms. Certainly just used for assistance
| though, the writing is too good overall.
| 29athrowaway wrote:
| I wonder what Stephen Wolfram has to say about this.
| Rhapso wrote:
| I wish John Conway was still around to comment.
| ekez wrote:
| There's something compelling about these, especially w.r.t. their
| ability to generalize. But what is the vision here? What might
| these be able to do in the future? Or even philosophically
| speaking, what do these teach us about the world? We know a 1D
| cellular automaton is Turing equivalent, so, at least from one
| perspective, NCA/these aren't terribly surprising.
| data-ottawa wrote:
| Potentially it would be useful if you could enter a grid from
| satellite images and simulate wildfire spread or pollution
| spread or similar problems.
| emmelaich wrote:
| The self-healing properties suggest biological evolution to me.
| achille wrote:
| these are going to be the dominant lifeforms on earth exceeding
| bacteria, plants and humans in terms of energy consumption
|
| cellular automata that interact with their environment, ones
| that interact with low level systems and high level
| institutions. to some approximation we, humans are just
| individual cells interacting in these networks. the future of
| intelligence aint llms, but systems of automata with metabolic
| aspects. automata that co-evolve, consume energy and produce
| value. ones that compete, ones that model each other.
|
| we're not being replaced, we're just participants in a
| transformation where boundaries between technological and
| cellular systems blur and eventually dissolve. i'm very
| thankful to be here to witness it
|
| see: https://x.com/zzznah/status/1803712504910020687
| ryukoposting wrote:
| I'll have what this guy is smoking. Those visualizations are
| pretty, though.
|
| I can imagine this being useful for implementing classifiers
| and little baby GenAI-adjacent tech on an extremely tiny
| scale, on the order of several hundred or several thousand
| transistors.
|
| Example: right now, a lot of the leading-edge biosensors have
| to pull data from their PPG/ECG/etc chips and run it through
| big fp32 matrices to get heart rate. That's hideously
| inefficient when you consider that your data is usually
| coming in as an int16 and resolution any better than 1bpm
| isn't necessary. But, fp32 is what the MCU can do in hardware
| so it's what you gotta do. Training one of these things to
| take incoming int16 data and spit out a heart rate could
| reduce the software complexity and cost of development for
| those products by several orders of magnitude, assuming
| someone like Maxim could shove it into their existing COTS
| biosensor chips.
| achille wrote:
| yes absolutely: current systems are wildly inefficient. the
| future is one of extreme energy efficiency.
|
| re smoking: sorry let me clarify my statement. these things
| will be the dominant life forms on earth in terms of
| metabolism, exceeding the energy consumption of biological
| systems, over 1k petawatt hours per year, dwarfing
| everything else
|
| the lines between us may blur metaphorically; we'll be
| connected to them the way we're connected to ecosystems of
| plants and bacteria. these systems will join and merge in
| the same way we've merged with smartphones -- but on a much
| deeper level
| BriggyDwiggs42 wrote:
| Okay so another way to put it is that these are gonna be
| the software we run on lots of computers in the future.
| Why this particular model of intelligence and not some
| other one?
| suddenlybananas wrote:
| So grandiose. It's a good thing the rapture is happening
| while you're alive to see it. You're just that important.
| achille wrote:
| i wasn't around to see the first humans land on the moon. i
| feel a similar deep sense of awe and excitement to see this
| revolution
| ysofunny wrote:
| because the goal of life is to maximize metabolic throughput?
|
| or to minimize energetic waste?
| emmelaich wrote:
| The resulting checkerboard pattern is the opposite (the NOT) of the
| target pattern. But this is not remarked upon. Is it too
| unimportant to mention or did I miss something?
| itishappy wrote:
| They're learning features, not the exact image (that's why it's
| so good at self healing). It should be invariant to shifts.
| eyvindn wrote:
| thanks for catching this, the figure for the target was
| inverted when exporting for publication, corrected now.
| vessenes wrote:
| Amazing paper, I re-read it in more detail today. It feels
| very rich, like almost a new field of study ---
| congratulations to the authors.
|
| I'm ninjaing in here to ask a q -- you point out in the
| checkerboard initial discussion that the 5(!) circuit game of
| life implementation shows bottom left to top right bias --
| very intriguing.
|
| However, when you show larger versions of the circuit, and in
| all future demonstrations, the animations are top left to
| bottom right. Is this because you trained a different
| circuit, and it had a different bias, or because you forgot
| and rotated them differently, or some other reason? Either
| way, I'd recommend you at least mention it in the later
| sections (or rotate the graphs if that aligns with the
| science) since you rightly called it out in the first
| instance.
| miottp wrote:
| Author here. Thank you! You're seeing that correctly. The
| directional bias is the result of some initial symmetry
| breaking and likely random-seed dependent. The version that
| constructs the checkerboard from the top-right down was
| trained asynchronously, and the one from the bottom-left up
| was trained synchronously. The resulting circuits are
| different.
| robwwilliams wrote:
| I wish we were all commenting about the ideas embedded in this
| paper. It intrigues me, but is out of my comfort zone. Love to
| read more content-related insights or criticisms rather than the
| long thread on the shamefully smooth, engaging, and occasionally
| rote style.
| vessenes wrote:
| I was reminded immediately of Wolfram's exploration of using
| cellular automata to get MNIST recognition results. The
| underlying mechanisms they both use are super different, but
| the ideas seem like strong siblings -- I attach them in my mind
| as saying computational complexity is almost shockingly
| expressive, and finding ways to search around the space of
| computation is pretty powerful.
|
| That said, I put in like 4 minutes skimming this paper, so my
| opinion is worth about the average of any Internet forum
| opinion on this topic.
|
| Anyway, I suggest reading Wolfram as well on this, it's pretty
| provocative.
| JFuzz wrote:
| This is wild. Long time lurker here, avid modeling and simulation
| user - I feel like there's some serious potential here to help
| provide more insight into "emergent behavior" in complex agent
| behavior models. I'd love to see this applied to models like a
| predator/prey model, and other "simple" models that generate
| complex "emergent" outcomes but on massive scales... I'm
| definitely keeping tabs on this work!
| bob1029 wrote:
| This is very interesting. I've been chasing novel universal
| Turing machine substrates. Collecting them like Pokemon for
| genetic programming experiments. I've played around with CAs
| before - rule 30/110/etc. - but this is a much more compelling
| take. I never thought to model the kernel like a digital logic
| circuit.
|
| The constraints of boolean logic, gates and circuits seem to
| create an interesting grain to build the fitness landscape with.
| The resulting parameters can be directly transformed to hardware
| implementations or passed through additional phases of
| optimization and then compiled into trivial programs. This seems
| better than dealing with magic floating points in the billion
| parameter black boxes.
| fnordpiglet wrote:
| Yeah this paper feels profoundly important to me. The ability
| to differentiate automata means you can do backward propagating
| optimization on Boolean circuit designs to learn complex
| discrete system behaviors. That's phenomenal.
| mempko wrote:
| There are a lot of cool ideas here. Maybe a small observation,
| but the computation is stateful. Each cell has a memory and a
| perception of its environment. Compare this to, say, a modern
| NN, which is stateless. Has there been any work on stateful
| LLMs, for instance?
| throwaway13337 wrote:
| This is exciting.
|
| Michael Levin best posited for me the question of how animal
| cells can act cooperatively without a hierarchy. He has some
| biological experiments showing, for example, eye cells in a frog
| embryo will move to where the eye should go even if you pull it
| away. The question I don't think he could really answer was 'how
| do the cells know when to stop?'
|
| Understanding non-hierarchical organization is key to
| understanding how society works, too. And to solve the various
| prisoner's dilemmas at various scales in our self-organizing
| world.
|
| It's also about understanding bare complexity and modeling it.
|
| This is the first time I've seen the ability to model this stuff.
|
| So many directions to go from here. Just wow.
| fc417fc802 wrote:
| > The question I don't think he could really answer was 'how do
| the cells know when to stop?'
|
| I'm likely missing something obvious but I'll ask anyway out of
| curiosity. How is this not handled by the well understood
| chemical gradient mechanisms covered in introductory texts on
| this topic? Essentially cells orient themselves within multiple
| overlapping chemical gradients. Those gradients are constructed
| iteratively, exhibiting increasingly complex spatial behavior
| at each iteration.
| cdetrio wrote:
| Textbook models typically simulate normal development of an
| embryo, e.g. A-P and D-V (anterior-posterior and dorsal-
| ventral) patterning. The question Levin raises is how a
| perturbed embryo manages to develop normally, both "picasso
| tadpoles" where a scrambled face will re-organize into a
| normal face, and tadpoles with eyes transplanted to their
| tails, where an optic nerve forms from the tail across to the
| brain and a functional eye develops.
|
| I haven't thoroughly read all of Levin's papers, so I'm not
| sure to what extent they specifically address the issue of
| whether textbook models of morphogen gradients can or cannot
| account for these experiments. I'd guess that it is difficult
| to say conclusively. You might have to use one of the
| software packages for simulating multi-cellular development,
| regulatory logic, and morphogen gradients/diffusion, if you
| wanted to argue either "the textbook model can generate this
| behavior" or that the textbook model cannot.
|
| The simulations/models that I'm familiar with are quite
| basic, relative to actual biology, e.g. models of drosophila
| eve stripes are based on a few dozen genes or less. But iiuc,
| our understanding of larval development and patterning of C.
| elegans is far behind that of drosophila (the fly embryo
| starts as a syncytium, unlike worms and vertebrates, which
| makes fly segmentation easier to follow). I haven't read
| about Xenopus (the frogs that Levin studies), but I'd guess
| that we are very far from being able to simulate all the way
| from embryo to facial development in the normal case, let
| alone the abnormal picasso and "eye on tail" tadpoles.
| triclops200 wrote:
| I'm not an expert on the actual biological mechanisms, but
| it makes intuitive sense to me that both of those effects
| would occur in the situation you described from simple
| cells working on gradients. I was one of the authors on
| this paper during my undergrad [1], and the generalized idea
| of an eye being placed on a tail and having nerves routed
| successfully through the body via pheromone gradient is
| exactly the kind of error I watched occur a dozen times
| while collecting the population error statistics for this
| paper. Same thing with the kind of error of a face re-
| arranging itself. The "ants" in this paper have no
| communication except chemical gradients similar to the ones
| talked about with morphogen gradients. I'm not claiming
| it's a proof of it working that way, ofc, but, even simpler
| versions of the same mechanism can result in the same kind
| of behavior and error.
|
| [1]: https://direct.mit.edu/isal/proceedings/alif2016/28/100/9940...
| cdetrio wrote:
| very interesting, thanks for sharing.
| Jerrrrrry wrote:
| What are Cognitive Light Cones? (Michael Levin Interview)
|
| https://www.youtube.com/watch?v=YnObwxJZpZc
| EMIRELADERO wrote:
| I've been thinking a lot about "intelligence" lately, and I feel
| like we're at a decisive point in figuring out (or at least
| greatly advancing our understanding of) how it "works". It seems to
| me that intelligence is an emergent natural behavior, not much
| different than classical Newtonian mechanics or electricity. It
| all seems to boil down to simple rules in the end.
|
| What if everything non-discrete about the brain is just
| "infrastructure"? Just supporting the fundamentally simple yet
| important core processes that do the actual work? What if it all
| boils down to logic gates and electrical signals, all the way
| down?
|
| Interesting times ahead.
| UncleOxidant wrote:
| Is there any code available?
| ysofunny wrote:
| probably not publicly
|
| why would they give their hard work away if they can keep it
| under wraps for greater profit and a worse world riddled with
| scarcity?
| eyvindn wrote:
| colab with all code will be available next week, will add link
| from the article.
| jimbohn wrote:
| I'll be waiting!
| showmexyz wrote:
| Can anybody point out what's special about this?
| achille wrote:
| https://xkcd.com/676 but now much, much more efficient
| showmexyz wrote:
| So is it about learning discrete logic to solve a problem
| rather than having a whole modern CPU, with all its
| abstractions, solve the given problem?
| phrotoma wrote:
| The impression I got, and I'd be happy to have someone help me
| improve this impression, is that it's a way to craft a CA that
| behaves the way you want as opposed to the traditional approach
| to studying CA's which involves tinkering with the rules and
| then seeing what behaviour emerges.
| showmexyz wrote:
| That's what I think it is about, reverse engineering basic
| rules from the end pattern.
| alex_abt wrote:
| > Imagine trying to reverse-engineer the complex, often unexpected
| patterns and behaviors that emerge from simple rules. This
| challenge has inspired researchers and enthusiasts that work with
| cellular automata for decades.
|
| Can someone shed some light on what makes this a problem worth
| investigating for decades, if at all?
| achille wrote:
| yes, think of it this way: why is it that bathing the Earth
| with 10^55 Boltzmann constants makes it seemingly emit a Tesla?
|
| can we construct a warm winter garment without having to
| manually pick open cotton poppies?
|
| if we place energy in the right location, can we have slime
| mold do computation for us?
|
| how do we organize matter and energy in order to watch a funny
| cat video?
| BriggyDwiggs42 wrote:
| https://writings.stephenwolfram.com/2024/08/whats-really-goi...
|
| One example is that Stephen Wolfram argues, I think
| compellingly, that machine learning "hitches on to" chaotic
| systems defined by simple rules and rides them for a certain
| number of steps in order to produce complex behaviors. If this
| is true, easily going in the reverse direction could give us
| lots of insight into ML.
| marmakoide wrote:
| Self-plug here, but very related => Robustness and the Halting
| Problem for Multicellular Artificial Ontogeny (2011)
|
| Cellular automata where the update rule is a perceptron coupled
| with an isotropic diffusion. The weights of the neural network are
| optimized so that the cellular automata can draw a picture, with
| self-healing (i.e. rebuilding the picture when perturbed).
|
| Back then, auto-differentiation was not as accessible as it is
| now, so the weights were optimized with an Evolution Strategy.
| Of course, using gradient descent is likely to be way better.
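|
| Roughly, one update step in that setup looks like this (a
| rough numpy sketch of the idea, not the original code; the
| names and the diffusion constant are mine):
|
|   import numpy as np
|
|   def step(state, W, b, alpha=0.1):
|       # isotropic diffusion: blend each cell with the mean of
|       # its four orthogonal neighbors (wrap-around edges)
|       nb = (np.roll(state, 1, 0) + np.roll(state, -1, 0) +
|             np.roll(state, 1, 1) + np.roll(state, -1, 1)) / 4.0
|       diffused = (1 - alpha) * state + alpha * nb
|       # perceptron update rule, applied identically at every cell
|       h, w, c = diffused.shape
|       flat = diffused.reshape(-1, c)
|       out = np.tanh(flat @ W + b)   # W: (c, c), b: (c,)
|       return out.reshape(h, w, c)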
| elnatro wrote:
| Wouldn't you need a custom non-von-Neumann architecture to
| leverage the full power of CA?
| Legend2440 wrote:
| You can emulate a cellular automaton just fine on our existing
| computers.
|
| But you could probably get better performance and power
| efficiency if you built a computer that was more... CA-like.
| e.g. a grid of memory cells that update themselves based on
| their neighbors.
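|
| Emulating one is only a few lines, e.g. Game of Life with
| numpy (just an illustration of the "cells update from their
| neighbors" loop, nothing from the paper):
|
|   import numpy as np
|
|   def step(grid):
|       # each cell counts its 8 neighbors (wrap-around edges)...
|       n = sum(np.roll(np.roll(grid, dy, 0), dx, 1)
|               for dy in (-1, 0, 1) for dx in (-1, 0, 1)
|               if (dy, dx) != (0, 0))
|       # ...and updates itself from that local count alone
|       alive = (n == 3) | ((grid == 1) & (n == 2))
|       return alive.astype(grid.dtype)
|
|   grid = np.random.randint(0, 2, (64, 64))
|   for _ in range(100):
|       grid = step(grid)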
| spyder wrote:
| Hmm.. could this be used for the ARC-AGI challenge? Maybe even
| combine with this recent one:
| https://news.ycombinator.com/item?id=43259182
| eyvindn wrote:
| :)
| mikewarot wrote:
| _If I understand the article correctly,_ this research shows that
| you can compress some 2D image into a circuit design that, if
| replicated _exactly_ many times in a grid, will spontaneously
| output the desired image.
|
| I'm interested in a nearby, but dissimilar project, almost its
| reciprocal, wherein you can generate a logic design that is NOT
| uniform, but where every cell is independent, to allow for
| general purpose computing. It seems we could take this work, and
| use it to evolve a design that could be put into an FPGA, and
| make far better utilization than existing programming methods
| allow, at the cost of huge amounts of compute to do the training.
| deadbabe wrote:
| It seems to me this is a concept of how an AGI would store
| memories of things it has seen or sensed and later recall them?
| NeutralForest wrote:
| Can someone ELI5 for a Muggle?
| vessenes wrote:
| Late here, but a few comments: the main idea of the authors was
| to combine differentiable logic gates (an amazing invention I had
| not heard of) with cellular automata as they say in the paper, or
| more accurately I would say a grid topology of small neural
| networks (cells). The cells get and send information to their
| neighbors.
|
| The idea would be you create some sort of outcome for fitness
| (say an image you want the cells to self organize into, or the
| rules of Conway's game of life), set up the training data, and
| because it's fully differentiable, Bob's your uncle at the end.
|
| Depending on what you think about computational complexity, this
| may or may not shock you.
|
| But since they've been doing gradient descent on differentiable
| _logic gates_ at the end of the day, when the training is done,
| they can just turn each cell into binary gates, think AND OR XOR,
| etc. You then have something that can be used for inference crazy
| fast. I presume it could also be laid out and sent to a fab, but
| that work is left for a later paper. :)
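|
| To make the "turn each cell into binary gates" point concrete,
| here's my own toy sketch of a relaxed two-input gate (based on
| the differentiable logic gate idea in general, not this paper's
| code; the relaxations and names are mine):
|
|   import numpy as np
|
|   # the 16 two-input boolean functions, as real-valued
|   # relaxations that agree with the truth tables on {0, 1}
|   OPS = [
|       lambda a, b: 0*a,                  # FALSE
|       lambda a, b: a * b,                # AND
|       lambda a, b: a - a*b,              # A AND NOT B
|       lambda a, b: a,                    # A
|       lambda a, b: b - a*b,              # NOT A AND B
|       lambda a, b: b,                    # B
|       lambda a, b: a + b - 2*a*b,        # XOR
|       lambda a, b: a + b - a*b,          # OR
|       lambda a, b: 1 - (a + b - a*b),    # NOR
|       lambda a, b: 1 - (a + b - 2*a*b),  # XNOR
|       lambda a, b: 1 - b,                # NOT B
|       lambda a, b: 1 - b + a*b,          # A OR NOT B
|       lambda a, b: 1 - a,                # NOT A
|       lambda a, b: 1 - a + a*b,          # NOT A OR B
|       lambda a, b: 1 - a*b,              # NAND
|       lambda a, b: 0*a + 1,              # TRUE
|   ]
|
|   def soft_gate(a, b, logits):
|       # training time: probability-weighted mix of all 16 ops
|       p = np.exp(logits - logits.max()); p /= p.sum()
|       return sum(pk * op(a, b) for pk, op in zip(p, OPS))
|
|   def hard_gate(a, b, logits):
|       # after training: keep only the most likely op
|       return OPS[int(np.argmax(logits))](a, b)
|
| During training every "gate" is really that soft mixture, so
| gradients flow; at the end you snap each one to its argmax and
| you're left with ordinary boolean logic.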
|
| This architecture could do a LOTTT of things to be clear. But
| sort of as a warm up they use all the Conway life start and end
| rules to train cells to implement Conway. Shockingly this can be
| done in 5 gates(!). I note that they mention almost everywhere
| that they hand prune unused gates - I imagine this will
| eventually be automated.
|
| They then go on to spec small 7k parameter or so neural networks
| that, when laid out in cells, can self-organize into different
| black and white or color images, and can even do so on larger
| base grids than they were trained on, and are resilient to noise
| being thrown at them. They then demonstrate that async networks
| (each cell updates randomly) can be trained, and are harder to
| train but more resilient to noise.
|
| All this is quite a lot to take in, and spectacular in my
| opinion.
|
| One thing they mention, a lot, is that a lot of hyperparameter
| tuning is required for "harder" problems. I can imagine like 50
| lines of research out of this paper, but one of them would
| certainly be adding stability in to the training process. Arc-AGI
| is mentioned here, and is an awesome idea -- could you get a
| "free lunch" with Arc? Or some of Arc? Different network
| topologies are yet another interesting question, hidden
| information, "backing layers" - e.g. why not give each cell 20
| private cells that info goes out to and comes back in? Why not
| make some of those cells talk to some other cells? Why not send
| radio waves as signals across the custom topology and train an
| efficient novel analog radio? Why not give each cell access to a
| shared "super sized" 100k, 1mmk parameter "thinking node"? What
| would a good topology be for different tasks?
|
| I'll stop here. Amazing paper. Quite a number of PhD papers will
| be generated out of it, I expect.
|
| I'd like to see Minecraft implemented though. Seems possible.
| Then we could have Bad Apple in Minecraft on raw circuits.
| Karrot_Kream wrote:
| Pruning excess gates will be interesting. I know this sort of
| thing generally works with reachability analysis, but I'm
| curious in practice how thorny this will be. Moreover I'm
| curious how "interpretable" the resulting circuits will be.
|
| Either way this research is fantastic. What a result.
| vessenes wrote:
| For sure. I guess you could run static analysis on the gates
| to determine what "hits" and what doesn't -- I'm not a chip
| designer, but I know the tools are super sophisticated, and
| these are, ultimately, very small circuits.
|
| I know that some early AI physics-enabled designs utilized
| "weird" analog features, but at small geometries especially,
| and in real life, everything is analog anyway. If these are
| gate-level, I guess the interpretability questions will be
| literally on assessing logic. There's so many paths to dig in
| here, it's super interesting.
| vessenes wrote:
| an edit -- a black and white checkerboard can be done in 5
| gates. Conway was more like 350 in the paper, apologies!
| calebm wrote:
| I love playing around with cellular automata for doing art. It's
| amazing what kind of patterns can emerge (example:
| https://gods.art/math_videos/hex_func27l_21.html). I may have to
| try to play with these DLCA.
| j_bum wrote:
| Lovely! Thanks for sharing. Would these patterns keep
| generating indefinitely?
| max_ wrote:
| So this does not need large training data sets like traditional
| models?
|
| The lizard and the Game of Life examples seem to illustrate
| that you only need one data point to create or "reverse
| engineer" an algorithm that "generates" something equal to the
| data point.
|
| How is this different from using a neural network and then
| overfitting it?
|
| Maybe the difference is that, instead of learning trained
| weights, the cellular automata learns a combination of logic
| (a circuit).
|
| So do the underlying problems with overfitting a neural network
| (a model being unable to generalise) still hold for this "logic
| cellular automata"?
| juxtaposicion wrote:
| It's interesting to see how differentiable logic/binary circuits
| can be made cheap at inference time.
|
| But what about the theoretical expressiveness of logic circuits
| vs baselines like MLPs? (And then of course compared to CNNs and
| other kernels.) Are logic circuits roughly equivalent in terms of
| memory and compute being used? For my use case, I don't care
| about making inference cheaper (e.g. the benefit logic circuits
| bring). But I do care about the recursion in space and time (the
| benefit from CAs). Would your experiments work if you still had a
| CA, but used dumb MLPs?
| scarmig wrote:
| Well, with all 16 logic gates available, they can express all
| Boolean circuits (you could get that even with NAND or NOR
| gates, of course, if you are working with arbitrary as opposed
| to fixed connectivity). And so you could have a 32 bit output
| vector which could be taken as a float (and you could create
| any circuit that computes any bitwise representation of a
| real).
|
| As for efficiency, it would depend on the problem. If you're
| trying to learn XOR, a differentiable logic gate network can
| learn it with a single unit with 16 parameters (actually, 4,
| but the implementation here uses 16). If you're trying to learn
| a linear regression, a dumb MLP would very likely be more
| efficient.
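|
| As a toy illustration of the XOR point (my own sketch, using
| PyTorch for the autodiff; the naive 16-logit parameterization
| here may differ from the paper's):
|
|   import torch
|
|   # truth tables of the 16 two-input boolean functions, over
|   # the input rows (a,b) = (0,0), (0,1), (1,0), (1,1)
|   tables = torch.tensor(
|       [[(k >> i) & 1 for i in range(4)] for k in range(16)],
|       dtype=torch.float32)                 # shape (16, 4)
|
|   target = torch.tensor([0., 1., 1., 0.])  # XOR
|   # the single unit's parameters: one logit per candidate op
|   logits = torch.zeros(16, requires_grad=True)
|   opt = torch.optim.Adam([logits], lr=0.1)
|
|   for _ in range(200):
|       probs = torch.softmax(logits, dim=0)
|       out = probs @ tables             # soft output on all 4 rows
|       loss = ((out - target) ** 2).mean()
|       opt.zero_grad(); loss.backward(); opt.step()
|
|   # harden: the argmax should settle on the XOR truth table
|   print(int(logits.argmax()))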
| srcreigh wrote:
| The Conway's game of life example isn't so impressive. The
| network isn't really reverse engineering rules, it's being
| trained on data that is equivalent to the rules. It's sort of
| like teaching + by giving it 400 data points, triplets (a,b,c),
| with 1 <= a,b <= 20 and c = a + b.
| Cladode wrote:
| Continuous relaxation of boolean algebra is an old idea with much
| literature. Circuit synthesis is a really well-researched field,
| with an annual conference and competition [1]. Google won the
| competition 2 years ago. I wonder if you have tried your learner
| against the IWLS competition data sets. That would calibrate the
| performance of your approach. If not, why not?
|
| [1] https://www.iwls.org/iwls2025/
| jderick wrote:
| Could this be used to train an LLM? It seems the hidden states
| could be used to learn how to store history.
| sim04ful wrote:
| This is a very interesting paper. Question though: since the
| cells' gates are updated using a "global" gradient descent, it
| seems the training isn't truly parallel.
|
| Is there any promise towards a strictly local weight adjustment
| method ?
___________________________________________________________________
(page generated 2025-03-07 23:00 UTC)