[HN Gopher] A Tiny Boltzmann Machine
       ___________________________________________________________________
        
       A Tiny Boltzmann Machine
        
       Author : anomancer
       Score  : 207 points
       Date   : 2025-05-15 13:41 UTC (9 hours ago)
        
 (HTM) web link (eoinmurray.info)
 (TXT) w3m dump (eoinmurray.info)
        
       | vanderZwan wrote:
       | Lovely explanation!
       | 
       | Just FYI: mouse-scrolling is much too sensitive for some reason
       | (I'm assuming it swipes just fine in mobile contexts, have not
       | checked that). The result is that it jumped from first to last
       | "page" and back whenever I tried scrolling. Luckily keyboard
       | input worked so I could still read the whole thing.
        
       | djulo wrote:
       | that's soooo coool
        
       | nonrandomstring wrote:
        | This takes me back. 1990, building Boltzmann machines and
       | Perceptrons from arrays of void pointers to "neurons" in plain C.
       | What did we use "AI" for back then? To guess the next note in a
       | MIDI melody, and to recognise the shape of a scored note, minim,
       | crotchet, quaver on a 5 x 9 dot grid. 85% accuracy was "good
       | enough" then.
        
         | bwestergard wrote:
         | Did the output sound musical?
        
           | nonrandomstring wrote:
           | For small values of "music"? Really, no. But tbh, neither
           | have more advanced "AI" composition experiments I've
            | encountered over the years: Markov models, linear predictive
            | coding, genetic/evolutionary algs, rule-based systems, and
            | now modern diffusion and transformers... they all lack the
           | "spirit of jazz" [0]
           | 
            | [0] https://i.pinimg.com/originals/e4/84/79/e484792971cc77ddff8f...
        
         | gopalv wrote:
         | > recognise the shape of a scored note, minim, crotchet, quaver
         | on a 5 x 9 dot grid
         | 
          | Reading music off a lined page sounds like a fun project,
          | particularly if done from scratch like 3Blue1Brown's
          | digit-recognition NN example[1].
         | 
          | Mix with something like Chuck[2] and you can write a
          | completely client-side application with today's tech.
         | 
         | [1] - https://www.3blue1brown.com/lessons/neural-networks
         | 
         | [2] - https://chuck.stanford.edu/
        
       | bbstats wrote:
       | anyone got an archived link?
        
       | tambourine_man wrote:
       | Typo
       | 
       | "The _y_ can be used for generating new data that..."
        
         | munchler wrote:
         | Another typo (or thinko) in the very first sentence:
         | 
         | "Here we introduce introduction to Boltzmann machines"
        
           | croemer wrote:
           | More typos (LLMs are really good at finding these):
           | 
           | "Press the "Run Simulation" button to start traininng the
           | RBM." ("traininng" -> "training")
           | 
           | "...we want to derivce the contrastive divergence
           | algorithm..." ("derivce" -> "derive")
           | 
           | "A visisble layer..." ("visisble" -> "visible")
        
       | nayuki wrote:
       | Oh, this is a neat demo. I took Geoff Hinton's neural networks
       | course in university 15 years ago and he did spend a couple of
       | lectures explaining Boltzmann machines.
       | 
       | > A Restricted Boltzmann Machine is a special case where the
       | visible and hidden neurons are not connected to each other.
       | 
       | This wording is wrong; it implies that visible neurons are not
       | connected to hidden neurons.
       | 
       | The correct wording is: visible neurons are not connected to each
       | other and hidden neurons are not connected to each other.
       | 
       | Alternatively: visible and hidden neurons do not have internal
       | connections within their own type.
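        | 
        | To make the bipartite structure concrete, here is a minimal
        | numpy sketch of the RBM energy function (the names and shapes
        | are my own illustration, not the article's code):
        | 
        |   import numpy as np
        | 
        |   rng = np.random.default_rng(0)
        |   n_visible, n_hidden = 6, 3
        |   W = rng.normal(0, 0.1, (n_visible, n_hidden))  # v<->h only
        |   a = np.zeros(n_visible)  # visible biases
        |   b = np.zeros(n_hidden)   # hidden biases
        | 
        |   def energy(v, h):
        |       # E(v,h) = -a.v - b.h - v.W.h; there is no v-v or h-h
        |       # term, which is exactly the "restricted" part.
        |       return -a @ v - b @ h - v @ W @ h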
        
         | CamperBob2 wrote:
         | _Alternatively: visible and hidden neurons do not have internal
         | connections within their own type._
         | 
         | I'm a bit unclear on how that isn't just an MLP. What's
         | different about a Boltzmann machine?
         | 
          |  _Edit:_ never mind, I didn't realize I needed to scroll _up_
         | to get to the introductory overview.
         | 
         | What 0xTJ's [flagged][dead] comment says about it being
         | undesirable to hijack or otherwise attempt to reinvent
         | scrolling is spot on.
        
           | nayuki wrote:
           | > I'm a bit unclear on how that isn't just a multi-layer
           | perceptron. What's different about a Boltzmann machine?
           | 
            | In a Boltzmann machine, you alternate back and forth:
            | visible units activate the hidden units, and then hidden
            | units activate the visible units.
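            | 
            | As a rough sketch of that alternation (binary units and
            | the usual W, a, b parameterization are assumed; this is
            | my illustration, not the article's code):
            | 
            |   import numpy as np
            | 
            |   def sigmoid(x):
            |       return 1.0 / (1.0 + np.exp(-x))
            | 
            |   def gibbs_step(v, W, a, b, rng):
            |       # visible -> hidden: sample h given v
            |       p_h = sigmoid(b + v @ W)
            |       h = (rng.random(p_h.shape) < p_h).astype(float)
            |       # hidden -> visible: sample v given h
            |       p_v = sigmoid(a + W @ h)
            |       v = (rng.random(p_v.shape) < p_v).astype(float)
            |       return v, h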
           | 
           | > What 0xTJ's [flagged][dead] comment says about it being
           | undesirable to hijack or otherwise attempt to reinvent
           | scrolling is spot on.
           | 
           | The page should be considered a slideshow that is paged
           | discretely and not scrollable continuously. And there should
           | definitely be no scrolling inertia.
        
       | sitkack wrote:
        | Fun article on David Ackley:
        | https://news.unm.edu/news/24-nobel-prize-in-physics-cited-gr...
       | 
       | Do check out his T2 Tile Project.
        
         | AstroJetson wrote:
          | The key takeaway is that there are lots of people involved
          | in making these breakthroughs.
         | 
          | The value of grad students is often overlooked; they
          | contribute so much and then later on advance the research
          | even more.
         | 
          | Why does America look on research as a waste, when it has
          | moved everything so far forward?
        
           | macintux wrote:
           | It's more accurate to say that businesspeople consider
           | research a waste in our quarter-by-quarter investment
           | climate, since it generally doesn't lead to immediate gains.
           | 
           | And our current leadership considers research a threat, since
           | science rarely supports conspiracy theorists or historical
           | revisionism.
        
       | itissid wrote:
        | IIUC, we need Gibbs sampling (to compute the weight updates)
        | instead of the gradient-based forward and backward passes
        | we're used to with today's neural networks. Anyone understand
        | why that is so?
        
         | ebolyen wrote:
         | Not an expert, but I have a bit of formal training on Bayesian
         | stuff which handles similar problems.
         | 
          | Usually Gibbs is used when there's no straightforward
         | gradient (or when you are interested in reproducing the
         | distribution itself, rather than a point estimate), but you do
         | have some marginal/conditional likelihoods which are simple to
         | sample from.
         | 
          | Since each visible node depends on each hidden node and each
          | hidden node affects all visible nodes, the gradient ends up
          | being very messy, so it's much simpler to use Gibbs sampling
          | to adjust based on marginal likelihoods.
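          | 
          | For illustration, a CD-1 style update built on that Gibbs
          | alternation might look roughly like this (my own sketch,
          | binary units assumed, not the article's code):
          | 
          |   import numpy as np
          | 
          |   def sigmoid(x):
          |       return 1.0 / (1.0 + np.exp(-x))
          | 
          |   def cd1_update(v0, W, a, b, lr, rng):
          |       # positive phase: hidden stats driven by the data
          |       ph0 = sigmoid(b + v0 @ W)
          |       h0 = (rng.random(ph0.shape) < ph0).astype(float)
          |       # negative phase: one Gibbs step -> reconstruction
          |       pv1 = sigmoid(a + W @ h0)
          |       v1 = (rng.random(pv1.shape) < pv1).astype(float)
          |       ph1 = sigmoid(b + v1 @ W)
          |       # move weights toward the data statistics and away
          |       # from the model's own statistics
          |       W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
          |       a += lr * (v0 - v1)
          |       b += lr * (ph0 - ph1)
          |       return W, a, b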
        
         | oac wrote:
         | I might be mistaken, but I think this is partly because of the
         | undirected structure of RBMs, so you can't build a
         | computational graph in the same way as with feed-forward
         | networks.
        
       | pawanjswal wrote:
       | Love how this breaks down Boltzmann Machines--finally makes this
       | 'energy-based model' stuff click!
        
       | BigParm wrote:
       | That font with a bit of margin looks fantastic on my phone
       | specifically. Really nailing the minimalist look. What font is
       | that?
        
         | mac9 wrote:
         | "font-family: ui-sans-serif, system-ui, sans-serif, "Apple
         | Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color
         | Emoji";"
         | 
          | from the CSS, so odds are it's whatever your browser or OS's
          | default sans-serif font is. In my case it's SF Pro, which is
          | an Apple font, though it may vary on a non-Apple device.
        
       | nickvec wrote:
       | > Here we introduce introduction to Boltzmann machines and
       | present a Tiny Restricted Boltzmann Machine that runs in the
       | browser.
       | 
       | nit: should "introduction" be omitted?
        
       | antidumbass wrote:
       | The section after the interactive diagrams has no left padding
       | and thus runs off the screen on iOS.
        
       | rollulus wrote:
       | Now the real question: is it you enjoying that nice page or is it
       | a Boltzmann Brain?
       | 
       | https://en.m.wikipedia.org/wiki/Boltzmann_brain
        
         | alganet wrote:
         | It doesn't matter.
         | 
          | It's Descartes' demon all over again. Problem solved
          | centuries ago. You can skin it however you want, it's the
          | same problem.
        
           | taneq wrote:
           | We can't really discuss Descartes without first explaining
           | the horse.
        
             | alganet wrote:
              | I don't think there is anything to discuss about Descartes.
        
       | nickvec wrote:
       | Great site! Would be cool to be able to adjust the speed at which
       | the simulation runs as well.
        
       | thingamarobert wrote:
        | This is very well made, and so nostalgic for me! My whole PhD,
        | from 2012 to 2016, was based on RBMs, and I learned so much
        | about generative ML through these models. Research has come so
        | far, and one doesn't hear much about them these days, but they
        | were really at the heart of the "AI Spring" back then.
        
       | tomrod wrote:
       | Great read!
       | 
       | One nit, a misspelling in the Appendix: derivce -> derive
        
       | oac wrote:
       | Nice and clean explanation!
       | 
       | It brings up a lot of memories! Shameless plug: I made a
       | visualization of an RBM being trained years ago:
       | https://www.youtube.com/watch?v=lKAy_NONg3g
        
       | dr_dshiv wrote:
       | My understanding is that the Harmonium (Smolensky) was the first
       | restricted Boltzmann machine, but maximized "harmony" instead of
       | minimizing "energy." When Smolensky, Hinton and Rummelhart
       | collaborated, they instead called it "goodness of fit."
       | 
       | The harmonium paper [1] is a really nice read. Hinton obviously
       | became the superstar and Smolensky wrote long books about
       | linguistics.
       | 
       | Anyone know more about this history?
       | 
        | [1] https://stanford.edu/~jlmcc/papers/PDP/Volume%201/Chap6_PDP8...
        
       | Nevermark wrote:
       | I mistook the title for "A Tiny Boltzmann Brain"! [0]
       | 
       | My own natural mind immediately solved the conundrum. Surely this
       | was a case where a very small model was given randomly generated
       | weights and then tested to see if it actually did something
       | useful!
       | 
       | After all, the smaller the model, the more likely simple random
       | generation can produce something interesting, relative to its
       | size.
       | 
       | I stand corrected, but not discouraged!
       | 
       | I propose a new class of model, the "Unbiased-Architecture
       | Instant Boltzmann Model" (UA-IBM).
       | 
       | One day we will have quantum computers large enough to simply set
       | up the whole dataset as a classical constraint on a model defined
       | with N serialized values, representing all the parameters and
       | architecture settings. Then let a quantum system with N qubits
       | take one inference step over all the classical samples, with all
       | possible parameters and architectures in quantum superposition,
       | and then reduce the result to return the best (or near best)
        | model's parameters and architecture in classical form.
       | 
        | Anyone have a few qubits lying around that want to give this a
        | shot? (The irony: everything is quantum, and yet it's so
        | slippery we can hardly put any of it to work yet.)
       | 
       | (Sci-fi story premise: the totally possible case of an alien
        | species that evolved a one-off quantum sensor, which evolved
        | into a whole quantum sensory system, then a nervous system, and
       | subsequently full quantum intelligence out of the gate. What kind
       | of society and technological trajectory would they have?
       | Hopefully they are in close orbit around a black hole, so the
       | impact of their explosive progress has not threatened us yet. And
       | then one day, they escape their gravity well, and ...)
       | 
       | [0] https://en.wikipedia.org/wiki/Boltzmann_brain
        
         | immibis wrote:
         | That isn't how quantum computers work.
        
           | Nevermark wrote:
           | My understanding, which is far from certain, is that a
           | superposition of circuit states with a superposition of
           | output values can be sampled with N x log2(N) hardware to
           | most likely return the optimal solution.
           | 
            | Granted, N x log2(N) is a very large amount of hardware for
           | sampling, since N is the number of parameters and
           | architectural setting bits.
           | 
           | I could be wrong, but I believe that is the case.
           | 
            | A more efficient (i.e., practical) system would require
            | significant additional complexity and iteration.
           | 
           | There are a lot of classical computations that can be done in
           | one pass too - if hardware is free, with enough available for
           | the problem size.
        
         | ithkuil wrote:
         | Poor quantum beings. They don't have access to a computation
          | model that exceeds the speed of their own thoughts, and they
          | are forever doomed to wait a long time for computations to
          | happen.
        
       ___________________________________________________________________
       (page generated 2025-05-15 23:00 UTC)