[HN Gopher] My failed attempt at AGI on the Tokio Runtime
___________________________________________________________________
My failed attempt at AGI on the Tokio Runtime
Author : openquery
Score : 69 points
Date : 2024-12-26 16:22 UTC (6 hours ago)
(HTM) web link (www.christo.sh)
(TXT) w3m dump (www.christo.sh)
| cglan wrote:
| I've thought of something like this for a while; I'm very
| interested in where this goes.
|
| A highly async actor model is something I've wanted to explore.
| Combined with a highly multi-core architecture clocked very,
| very low, it seems like it could be power-efficient too.
|
| I was considering using go + channels for this
| openquery wrote:
| Give it a shot. It isn't much code.
|
| If you want to look at more serious work, the Spiking Neural Net
| community has made models which actually work and are power
| efficient.
| jerf wrote:
| The idea has kicked around in hardware for a number of years,
| such as: https://www.greenarraychips.com/home/about/index.php
|
| I think the problem isn't that it's a "bad idea" in some
| intrinsic sense, but that you really have to have a problem
| that it fits like a glove. By the nature of the math, if you
| can only use 4 of your 128 cores 50% of the time, your
| performance just tanks no matter how fast you're going the
| other 50% of the time.
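|
| (One way to put numbers on it: if half the serial work can
| spread over all 128 cores but the other half only over 4, the
| run takes W/256 + W/8 = 33W/256 versus W/128 ideally -- about
| 16x worse, i.e. roughly 8 cores' worth of throughput out of
| 128.)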
|
| Contra the occasional "Everyone Else Is Stupid And We Just Need
| To Get Off Of von Neumann Architectures To Reach Nirvana" post,
| CPUs are shaped the way they are for a reason; being able to
| bring very highly concentrated power to bear on a specific
| problem is very flexible, especially when you can move the
| focus around very quickly as a CPU can. (Not instantaneously,
| but quickly, and this switching penalty is something that can
| be engineered around.) A lot of the rest of the problem space
| has been eaten by GPUs. This sort of "lots of low powered
| computers networked together" still fits in between them
| somewhat, but there's not a lot of space left anymore. They can
| communicate better in some ways than GPU cores can communicate
| with each other, but that is also a problem that can be
| engineered around.
|
| If you squint really hard, it's possible that computers are
| sort of wandering in this direction, though. Being low power
| means it's also low-heat. Putting "efficiency cores" on to CPU
| dies is sort of, kind of starting down a road that could end up
| at the greenarray idea. Still, it's hard to imagine what even
| all of the Windows OS would do with 128 efficiency cores. Maybe
| if someone comes up with a brilliant innovation on current AI
| architectures that requires some sort of additional cross-talk
| between the neural layers that simply _requires_ this sort of
| architecture to work, you could see this pop up... which I
| suppose brings us back around to the original idea. But it's
| hard to imagine what that architecture could be, where the
| communication is vital on a nanosecond-by-nanosecond level and
| can't just be a separate phase of processing a neural net.
| openquery wrote:
| > By the nature of the math, if you can only use 4 of your
| 128 cores 50% of the time, your performance just tanks no
| matter how fast you're going the other 50% of the time.
|
| I'm not sure I understand this point. If you're using a work-
| stealing threadpool servicing tasks in your actor model
| there's no reason you shouldn't get ~100% CPU utilisation
| provided you are driving the input hard enough (i.e. sampling
| often from your inputs).
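|
| A minimal sketch of the shape I mean (illustrative only, not
| the code from the post; the Spike type and the sizes are made
| up):
|
|     use tokio::sync::mpsc;
|
|     #[derive(Clone, Copy)]
|     struct Spike(f32); // hypothetical message type
|
|     #[tokio::main] // multi-threaded, work-stealing runtime
|     async fn main() {
|         let mut senders = Vec::new();
|
|         // One lightweight "neuron" task per actor; the
|         // scheduler spreads ready tasks across all workers.
|         for _ in 0..128 {
|             let (tx, mut rx) = mpsc::channel::<Spike>(64);
|             senders.push(tx);
|             tokio::spawn(async move {
|                 while let Some(Spike(v)) = rx.recv().await {
|                     let _ = v * 0.9; // per-neuron work goes here
|                 }
|             });
|         }
|
|         // Drive the inputs hard enough and every worker thread
|         // stays busy servicing whichever tasks are ready.
|         for i in 0..1_000_000u64 {
|             let tx = &senders[(i as usize) % senders.len()];
|             let _ = tx.send(Spike(1.0)).await;
|         }
|     }
|
| With the default multi-threaded runtime those tasks get work-
| stolen across every core, so utilisation tracks how hard the
| input loop pushes.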
| robblbobbl wrote:
| Finally singularity confirmed, thanks.
| dhruvdh wrote:
| I wish more people would just try things like this and blog
| about their failures.
|
| > The published version of a proof is always condensed. And even
| if you take all the math that has been published in the history
| of mankind, it's still small compared to what these models are
| trained on.
|
| > And people only publish the success stories. The data that are
| really precious are from when someone tries something, and it
| doesn't quite work, but they know how to fix it. But they only
| publish the successful thing, not the process.
|
| - Terence Tao (https://www.scientificamerican.com/article/ai-
| will-become-ma...)
|
| Personally, I think failures on their own are valuable. Others
| can come in and branch off from a decision you made that instead
| leads to success. Maybe the idea can be applied to a different
| domain. Maybe your failure clarified something for someone.
| openquery wrote:
| Thank you for saying this. I agree, which is why I wrote this
| up.
| markisus wrote:
| > The only hope I have is to try something completely novel
|
| I don't think this is true. Neural networks were not completely
| novel when they started to work. Someone just used a novel piece
| -- the GPU. Whatever the next thing is, it will probably be a
| remix of preexisting components.
| openquery wrote:
| Right. Ironically I chose a model that was around in the 1970s
| without knowing it.
|
| My point was more a game-theoretic one. There is just no chance
| I would beat the frontier labs if I tried the same things with
| less compute and fewer people. (Of course there is almost 0
| chance I would beat them at all.)
| andsoitis wrote:
| If you're looking for a neuroscience approach, check out Numenta
| https://www.numenta.com/
| namero999 wrote:
| Isn't this self-refuting? From the article:
|
| > Assume you are racing a Formula 1 car. You are in last place.
| You are a worse driver in a worse car. If you follow the same
| strategy as the cars in front of you, pit at the same time and
| choose the same tires, you will certainly lose. The only chance
| you have is to pick a different strategy.
|
| So why model brains and neurons at all? You are outgunned by at
| least 300,000 years of evolution and 117 billion
| training sessions.
| andrewflnr wrote:
| Because bio brains aren't even in the same race.
| dudeinjapan wrote:
| The greatest trick AGI ever pulled was convincing the world it
| didn't exist.
| homarp wrote:
| using https://news.ycombinator.com/item?id=42324444 you could
| make a better joke
|
| Also I was wondering about the source of the original quote,
| https://quoteinvestigator.com/2018/03/20/devil/
| alecst wrote:
| Love the drawings. Kind of a silly question, but how did you do
| them?
| openquery wrote:
| Excalidraw[0] and a mouse and a few failed attempts :)
|
| [0] https://excalidraw.com/
| Onavo wrote:
| > _Ok how the hell do we train this thing? Stochastic gradient
| descent with back-propagation won't work here (or if it does I
| have no idea how to implement it)._
|
| What's wrong with gradient descent?
|
| https://snntorch.readthedocs.io/en/latest/
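|
| (snntorch trains through the spike with a surrogate gradient:
| keep the hard threshold on the forward pass, substitute a
| smooth stand-in derivative on the backward pass. A toy scalar
| sketch of the idea, not snntorch's actual code:)
|
|     // Forward: a hard threshold, whose true derivative is 0
|     // almost everywhere, so it passes no learning signal.
|     fn spike(v: f32, threshold: f32) -> f32 {
|         if v >= threshold { 1.0 } else { 0.0 }
|     }
|
|     // Backward: pretend the spike was a sigmoid centred on
|     // the threshold and use its derivative instead.
|     fn surrogate_grad(v: f32, threshold: f32) -> f32 {
|         let s = 1.0 / (1.0 + (-(v - threshold)).exp());
|         s * (1.0 - s)
|     }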
| thrance wrote:
| Gradient descent needs a differentiable system, and the
| author's clearly isn't.
| openquery wrote:
| Thanks for sharing. I thought the discontinuous nature of the
| SNN made it non-differentiable and therefore unsuitable for SGD
| and backprop.
| henning wrote:
| The author could first reproduce models and results from papers
| before trying to extend that work. Starting with something
| working helps.
| skeledrew wrote:
| Interesting. I started a somewhat conceptually similar project
| several months ago. For me though, the main motivation is that I
| think there's something fundamentally wrong with the current
| method of using matrix math for weight calculation and
| representation. I'm taking the approach that the very core of how
| neurons work is inherently binary, and should remain that way. My
| basic thesis is that it should reduce computational requirements,
| and lead to something more generic. So I set out to build
| something that takes an array of booleans (the upstream neurons
| either fired or didn't fire at a particular time sequence), and
| gives a single boolean calculated with a customizable activator
| function.
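|
| Interface-wise it's roughly this shape (a Rust sketch purely
| to illustrate; the names and the activator are placeholders):
|
|     /// One neuron over boolean inputs.
|     struct Neuron {
|         /// Indices of the upstream neurons this one watches.
|         upstream: Vec<usize>,
|         /// Customizable activator: sees which upstream neurons
|         /// fired this tick, decides whether to fire.
|         activator: fn(&[bool]) -> bool,
|     }
|
|     impl Neuron {
|         /// `fired[i]` says whether neuron i fired this tick.
|         fn step(&self, fired: &[bool]) -> bool {
|             let inputs: Vec<bool> =
|                 self.upstream.iter().map(|&i| fired[i]).collect();
|             (self.activator)(&inputs)
|         }
|     }
|
|     // Example activator: fire if at least half the inputs fired.
|     fn majority(inputs: &[bool]) -> bool {
|         2 * inputs.iter().filter(|&&b| b).count() >= inputs.len()
|     }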
|
| The project is currently on ice: I built something that
| constructs a network of layers, but ran into a wall figuring
| out how to have that network wire itself over time and become
| representative of whatever it's learned. I'll take some time
| to go through this, see what it may spark, and try to start
| working on mine again.
| openquery wrote:
| Nice. Interested to see where this leads.
|
| The network in the article doesn't have explicit layers. It's a
| graph which is initialised with a completely random
| connectivity matrix. The inputs and outputs are also wired
| randomly in the beginning (an input could be connected to a
| neuron which is also connected to an output for example, or the
| input could be connected to a neuron which has no post-synaptic
| neurons).
|
| It was the job of the optimisation algorithm to figure out the
| graph topology over training.
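|
| In rough Rust, the initialisation looked something like this
| (a simplified sketch, not the exact code from the post; the
| rand calls and sizes are illustrative):
|
|     use rand::Rng;
|
|     struct Network {
|         /// connectivity[pre][post]: does pre synapse onto post?
|         connectivity: Vec<Vec<bool>>,
|         /// which neuron each external input feeds into
|         input_map: Vec<usize>,
|         /// which neuron each external output reads from
|         output_map: Vec<usize>,
|     }
|
|     fn random_network(n: usize, inputs: usize, outputs: usize,
|                       p: f64) -> Network {
|         let mut rng = rand::thread_rng();
|         Network {
|             // every possible edge exists with probability p;
|             // there are no layers at all
|             connectivity: (0..n)
|                 .map(|_| (0..n).map(|_| rng.gen_bool(p)).collect())
|                 .collect(),
|             // inputs and outputs land on arbitrary neurons, so
|             // an input can share a neuron with an output, or
|             // feed a neuron with no post-synaptic neurons
|             input_map: (0..inputs)
|                 .map(|_| rng.gen_range(0..n)).collect(),
|             output_map: (0..outputs)
|                 .map(|_| rng.gen_range(0..n)).collect(),
|         }
|     }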
___________________________________________________________________
(page generated 2024-12-26 23:01 UTC)