[HN Gopher] Problem solving often a matter of cooking up an appr...
       ___________________________________________________________________
        
       Problem solving often a matter of cooking up an appropriate Markov
       chain (2007)
        
       Author : Alifatisk
       Score  : 188 points
       Date   : 2025-07-30 12:29 UTC (10 hours ago)
        
 (HTM) web link (math.uchicago.edu)
 (TXT) w3m dump (math.uchicago.edu)
        
       | satvikpendem wrote:
       | I assume OP watched this excellent Veritasium video about Markov
       | and his chains [0] and posted this article which was referenced
       | in that video.
       | 
       | [0] https://youtu.be/KZeIEiBrT_w
        
         | Alifatisk wrote:
          | You nailed it. I wanted to post the actual article but found
          | that it was behind a paywall, and I don't know whether sharing
          | pirated content is allowed here, but it's definitely worth a
          | read!
        
           | danlitt wrote:
           | It is often not piracy to find open copies of papers; the
           | authors often publish them directly (on their personal pages
           | or on arxiv). A commenter further down found a public link:
           | https://math.uchicago.edu/~shmuel/Network-course-
           | readings/Ma...
        
         | stevenAthompson wrote:
         | I assumed he read the Illuminatus! Trilogy and wondered where
         | Markoff Chaney's name originated... There might be something
         | wrong with me, now that I think about it. /s
        
           | reverendsteveii wrote:
           | That's where I first learned about them.
           | 
           | NO SMOKING. NO SPITTING. MGMT
        
         | baxtr wrote:
          | 3.4 million views in 4 days. Wow...
        
           | stronglikedan wrote:
           | Veritasium is quality content. Those eyes don't hurt nothing
           | either.
        
             | tbalsam wrote:
              | They unfortunately sold out to private equity in the last
              | few years (which tends to gloss over fundamentals and pump
              | out massive amounts of content, trading on the previous
              | brand quality to give it credence), so beware of the
              | quality of more recent vids:
             | 
             | https://youtu.be/hJ-rRXWhElI?si=Zdsj9i_raNLnajzi
        
               | fn-mote wrote:
               | Yesterday I was reading comments about how the market
               | could pay for research and avoid the "distorting effects"
               | of public funding.
               | 
               | Is there any way to get a better outcome for the public
               | here, or is "do good stuff then sell out" the way it's
               | always going to be?
        
               | edwardbernays wrote:
               | What distorting effects of public funding? What about the
               | distortionary effects of the market? I'll offer the
               | suggestion that what you read is brainrotting private
               | market propaganda designed to erode the public
               | institutions that make America happier, healthier, and
               | wealthier.
        
             | aarond0623 wrote:
             | He's had a couple of misleading videos over the last few
             | years that finally made me unsubscribe. Specifically the
             | lightbulb with a 1 light second wire and the more recent
             | video about light taking infinite paths.
             | 
              | There was also the Waymo ad, and the Rods from the Gods
              | video where he couldn't be bothered to use a guide wire to
              | aim.
        
               | QuadmasterXLII wrote:
                | Private equity, baby! Not just for shitting up your
                | dentists and toy stores anymore.
        
               | deadso wrote:
               | What was wrong with the 1 light second wire and the light
               | taking infinite paths videos?
        
               | keeda wrote:
               | +1 I would like to know too. Especially the experimental
               | demo of infinite paths -- I'm a complete noob in quantum
               | physics, and the video made sense of so many topics I
               | "learned" in college but never managed to grok. It'd be
               | good to know what the alternative explanation is.
        
               | aarond0623 wrote:
                | The first one was framed in a clickbait "everything you
                | know about electricity is wrong" way. There have been
                | several response videos that lay out why it's misleading
                | better than I can, but suffice it to say that the
                | lightbulb does not turn on immediately like he claims.
                | 
                | The second one takes a mathematical model, the path
                | integral for light, and portrays it as what's actually
                | happening, with plenty of phrases like light "chooses"
                | the path of least action that imply something more is
                | going on. Also, the experiment at the end with the laser
                | pointer is awful. The light we are seeing is scattering
                | from the laser pointer's aperture, not evidence that
                | light is taking alternate paths.
        
               | ww520 wrote:
               | Yeah, the light propagation videos are just high on
               | misleading theories.
        
               | teaearlgraycold wrote:
               | The 1 light second wire video is kinda set up to
               | bamboozle you. But it's still correct and taught me about
               | EM.
        
               | javier2 wrote:
                | Especially the way he (or the team) responded to the
                | criticism they got for doing those "sponsored content"
                | pieces put me off hard enough to unsubscribe.
        
             | edwardbernays wrote:
             | Nah, he transparently accepted money from waymo to peddle
             | propaganda. Once somebody takes propaganda money, there's
             | no trusting them anymore. From then on out, everything they
             | do is in service to propaganda paymasters. Even just doing
             | regular, good quality work can only be viewed through the
             | lens of acquiring social capital to liquidate into
             | financial capital later.
             | 
             | See: Brian Keating licking Eric Weinstein's jock strap in
             | public and then offering mild criticism on Piers Morgan.
        
               | pstuart wrote:
               | > transparently accepted money from waymo to peddle
               | propaganda
               | 
               | If transparent enough (and not from an abhorrent source),
               | I'd be ok with his product. He's even allowed to make the
               | occasional mistake as long as he properly owns up to it.
               | 
                | There's been a lot of valuable learning from him and it
                | would be a pity to dismiss it all over a single fumble.
        
       | stevenAthompson wrote:
       | Here's a direct link to a PDF.
       | 
       | http://math.uchicago.edu/~shmuel/Network-course-readings/Mar...
        
         | tomhow wrote:
         | Updated, thanks!
        
       | tantalor wrote:
       | (2007)
        
         | Iwan-Zotow wrote:
         | And? MC didn't change much, did they?
        
           | taeric wrote:
           | Putting the date is often a good affordance to help people
           | scanning the headlines know if they have seen this one
           | before. It is not, necessarily, meant as a knock against it.
        
             | gertlex wrote:
             | Especially with academic papers where publish dates are
             | often not easy to find in a super-quick skim.
             | 
             | (there's a copyright 2007 at the bottom of the linked page,
             | which isn't explicitly "published in 2007" in my mind)
        
       | postit wrote:
        | Markov chains are a gateway drug to the more advanced
        | probabilistic graphical models, which are worth exploring. I
        | still remember working through Koller & Friedman cover to cover
        | as one of the best learning experiences I've ever had.
        
         | notrealyme123 wrote:
         | Just bought it last week. This book just makes me happy.
         | 
          | It feels like Bishop's Pattern Recognition but with a clearer
          | tone (and a different field, of course).
        
         | roadside_picnic wrote:
         | PGMs are a great topic to explore (and Koller+Friedman is a
         | great book), but, as a word of caution to anyone interested:
         | implementation of any of these more advanced models remains a
         | _major_ challenge. For anyone building production facing
         | models, even if your problem is a pretty good match for the
         | more interesting PGMs, the engineering requirements alone are a
         | good reason not to go too far down that path.
         | 
          | The PGM book is also structured very clearly for researchers in
          | PGMs. The book is laid out in three major sections: the models,
          | inference techniques (the bulk of the book), and learning.
         | Which means, if you follow the logic of the book, you basically
         | have to work through 1000+ pages of content before you can
         | actually start running even toy versions of these models. If
         | you do need to get into the nitty-gritty of particular
         | inference algorithms, I don't believe there is another textbook
         | with nearly the level of scope and detail.
         | 
         | Bishop's section on PGMs from _Pattern Recognition and Machine
         | Learning_ is probably a better place to start learning about
         | these more advanced models, and if you become very interested
         | then Koller+Friedman will be an invaluable text.
         | 
         | It's worth noting that the PGM course taught by Koller was one
         | of the initial, and still very excellent, Coursera courses. I'm
         | not sure if it's still free, but it was a nice way to get a
         | deep dive into the topic in a reasonably short time frame (I do
         | remember those homeworks as brutal though!)[0].
         | 
         | 0. https://www.coursera.org/specializations/probabilistic-
         | graph...
        
           | yanovskishai wrote:
            | Played with Bayesian nets a bit in grad school--Pearl's
            | causality stuff is still mind-blowing--but I've almost never
            | bumped into a PGM in production. A couple of things kept
            | biting us:
            | 
            | Inference pain. Exact inference is NP-hard, and the usual
            | hacks (loopy BP, variational, MCMC) need a ton of hand-tuning
            | before they run fast enough.
           | 
           | The data never fits the graph. Real-world tables are messy
           | and full of hidden junk, so you either spend weeks arguing
           | over structure or give up the nice causal story.
           | 
           | DL stole the mind-share. A transformer is a one-liner with a
           | mature tooling stack; hard to argue with that when deadlines
           | loom.
           | 
            | That said, they're not completely dead - reportedly there's
            | Microsoft's TrueSkill (Xbox ranking), a bunch of Google
            | ops/diagnosis pipelines, and some healthcare diagnosis tools
            | by IBM Watson built on Infer.NET.
            | 
            | Anyone here actually shipped a PGM that beat a neural
            | baseline? Would love to hear your war stories.
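            | 
            | To make the inference-pain point concrete, here's a toy
            | sketch (Python, with made-up CPT numbers) of exact inference
            | by enumeration on a three-variable net. Each extra hidden
            | variable doubles the sum, which is the blow-up that loopy BP,
            | variational methods and MCMC try to dodge:
            | 
            |     # Toy net: Rain -> Sprinkler, (Sprinkler, Rain) -> WetGrass.
            |     # CPT values are illustrative only.
            |     P_rain = {True: 0.2, False: 0.8}
            |     P_sprinkler = {True: {True: 0.01, False: 0.99},   # P(S | R=True)
            |                    False: {True: 0.4, False: 0.6}}    # P(S | R=False)
            |     P_wet = {(True, True): 0.99, (True, False): 0.9,  # P(W=True | S, R)
            |              (False, True): 0.8, (False, False): 0.0}
            | 
            |     def p_rain_given_wet():
            |         # Enumerate every joint configuration consistent with W=True.
            |         joint = {(r, s): P_rain[r] * P_sprinkler[r][s] * P_wet[(s, r)]
            |                  for r in (True, False) for s in (True, False)}
            |         return sum(v for (r, _), v in joint.items() if r) / sum(joint.values())
            | 
            |     print(p_rain_given_wet())  # ~0.36 with these made-up numbers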
        
       | mindcrime wrote:
       | Heh, somebody watched that same Veritasium video about Markov
       | Chains and Monte Carlo methods. Great video; lots of interesting
       | historical stuff in there that I didn't know (like the feud
       | between Markov and Nekrasov).
        
         | carlmcqueen wrote:
          | For a while, early in my career, I was working on Monte Carlo
          | sims in my finance job, rebuilding an existing archaic Excel
          | monster in Python to be more flexible for new investment models
          | and to allow for more levers. Since I was already working with
          | them daily, I began applying Monte Carlo models to a lot more
          | problems I was thinking about. It truly is a fun tool to play
          | with, especially when you're in the thick of designing them.
        
       | amelius wrote:
       | If you're holding a hammer every problem looks like a nail ...
        
         | tempodox wrote:
         | Yea, if it were true, then Markov chains on steroids (LLMs)
         | would be super at problem solving, but they can't even reason.
         | And they get worse if you add random facts about cats:
         | https://news.ycombinator.com/item?id=44724238
        
           | CrazyStat wrote:
           | That doesn't follow at all, but I guess don't pass up any
           | opportunity to bash LLMs.
        
           | ducktective wrote:
           | LLMs are Markov Chains on steroids?
           | 
           | By the way, does anyone know which model or type of model was
           | used in winning gold in IMO?
        
             | magicalhippo wrote:
             | > LLMs are Markov Chains on steroids?
             | 
             | Might be a reference to this[1] blog post which was posted
             | here[2] a year ago.
             | 
             | There has also been some academic work linking the two,
             | like this[3] paper.
             | 
             | [1]: https://elijahpotter.dev/articles/markov_chains_are_th
             | e_orig...
             | 
             | [2]: https://news.ycombinator.com/item?id=39213410
             | 
             | [3]: https://arxiv.org/abs/2410.02724
        
             | roadside_picnic wrote:
             | > LLMs are Markov Chains on steroids?
             | 
             | It's not an unreasonable view, at least for decoder-only
             | LLMs (which is what most popular LLMs are). While it may
             | seem they violate the Markov property since they clearly do
             | make use of their history, in practice that entire history
             | is summarized in an embedding passed into the decoder.
              | I.e. just like a Markov chain, their entire history is
              | compressed into a single point, which leaves them
              | conditionally independent of their past given their present
              | state.
             | 
              | It's worth noting that this claim is NOT _generally
              | applicable_ to LLMs, since both encoder/decoder and
              | encoder-only LLMs _do violate_ the Markov property and
              | therefore cannot be properly considered Markov chains in a
              | meaningful way.
             | 
              | But running inference on a decoder-only model is, at a high
              | enough level of abstraction, conceptually the same as
              | running a Markov chain (on steroids).
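              | 
              | A toy sketch of that framing (a dummy bigram table stands
              | in for the decoder's next-token distribution; no real LLM
              | involved): if the chain's state is defined as everything
              | generated so far, each step depends only on that state,
              | which is exactly the Markov property.
              | 
              |     import random
              | 
              |     def next_token_distribution(state):
              |         # Stand-in for a decoder-only forward pass. A real
              |         # decoder would use all of `state`; this dummy only
              |         # looks at the last token.
              |         last = state[-1] if state else "<s>"
              |         table = {"<s>": {"the": 0.6, "a": 0.4},
              |                  "the": {"cat": 0.5, "dog": 0.5},
              |                  "a": {"cat": 0.5, "dog": 0.5}}
              |         return table.get(last, {"</s>": 1.0})
              | 
              |     def generate(max_len=10):
              |         state = ()  # the Markov state: all tokens so far
              |         for _ in range(max_len):
              |             dist = next_token_distribution(state)
              |             tok = random.choices(list(dist), weights=list(dist.values()))[0]
              |             if tok == "</s>":
              |                 break
              |             state = state + (tok,)  # transition to the new state
              |         return " ".join(state)
              | 
              |     print(generate())  # e.g. "the cat"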
        
       | naasking wrote:
       | Unless the problem is "quantum mechanics", then that reduces to
       | non-Markovian processes with unistochastic laws, and ironically,
       | this makes for a causally local QM:
       | 
       | https://arxiv.org/abs/2402.16935v1
        
       | Pamar wrote:
       | I am on the move so I cannot check the video (but I did skim the
       | pdf). Is there any chance to see an example of this technique?
       | Just a toy/trivial example would be great, TIA!
        
         | mindcrime wrote:
         | For the Monte Carlo Method stuff in particular[1], I get the
         | sense that the most iconic "Hello, World" example is using MC
         | to calculate the value of pi. I can't explain it in detail from
         | memory, but it's approximately something like this:
         | 
         | Define a square of some known size (1x1 should be fine, I
         | think)
         | 
         | Inscribe a circle inside the square
         | 
         | Generate random points inside the square
         | 
         | Look at how many fall inside the square but not the circle,
         | versus the ones that do fall in the circle.
         | 
         | From that, using what you know about the area of the square and
         | circle respectively, the ratio of "inside square but not in
         | circle" and "inside circle" points can be used to set up an
         | equation for the value of pi.
         | 
         | Somebody who's more familiar with this than me can probably fix
         | the details I got wrong, but I think that's the general spirit
         | of it.
         | 
         | For Markov Chains in general, the only thing that jumps to mind
         | for me is generating text for old school IRC bots. :-)
         | 
          | [1]: which is probably not the point of this essay. Sorry for
          | muddying the waters; I have both concepts kinda 'top of mind'
          | right now after watching the Veritasium video.
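          | 
          | A minimal sketch of that estimate in Python (circle of radius
          | 0.5 inscribed in the unit square, so the fraction of points
          | landing inside the circle approaches pi/4):
          | 
          |     import random
          | 
          |     def estimate_pi(n_samples=1_000_000):
          |         inside = 0
          |         for _ in range(n_samples):
          |             # Random point in the unit square
          |             x, y = random.random(), random.random()
          |             # Is it inside the inscribed circle centered at (0.5, 0.5)?
          |             if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25:
          |                 inside += 1
          |         # circle area / square area = pi/4
          |         return 4 * inside / n_samples
          | 
          |     print(estimate_pi())  # ~3.14; error shrinks like 1/sqrt(n)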
        
           | ilyakaminsky wrote:
           | TIL, thanks! I asked Claude to generate a simulator [1] based
           | on your comment. I think it came out well.
           | 
           | [1] https://claude.ai/public/artifacts/1b921a50-897e-4d9e-8cf
           | a-0...
        
           | jldugger wrote:
           | > From that, using what you know about the area of the square
           | and circle respectively, the ratio of "inside square but not
           | in circle" and "inside circle" points can be used to set up
           | an equation for the value of pi.
           | 
           | Back in like 9th grade, when Wikipedia did not yet exist (but
           | MathWorld and IRC did) I did this with TI-Basic instead of
           | paying attention in geometry class. It's cool, but converges
           | hilariously slowly. The in versus out formula is basically
           | distance from origin > 1, but you end up double sampling a
           | lot using randomness.
           | 
            | I told a college roommate about it and he basically invented
            | a calculus approach, summing pixels in columns or something,
            | as an optimization. You could probably optimize further by
            | finding upper and lower bounds of the "frontier" of the
            | circle, or iteratively splitting rectangle slices ad
            | infinitum, but that's probably just reinventing the state of
            | the art. And all of this skips the cool random sampling the
            | Monte Carlo algorithm uses.
        
       | nyeah wrote:
       | I mean, often, sure. Also problem solving is often a matter of
       | making a spreadsheet full of +, -, *, / operations. Problem
       | solving is often a matter of counting permutations and
       | combinations. It's often a matter of setting up the right system
       | of ordinary differential equations. It's often a matter of
       | setting up the right linear algebra problem.
       | 
       | It's often a matter of asking the right person what technique
       | works. It's often a matter of making a measurement before getting
       | lost in the math. It's often a matter of finding the right paper
       | in the literature.
        
       | theturtletalks wrote:
       | The Veritasium video brought up an interesting point about how
       | LLMs, if trained too much on their own content, will fall victim
       | to the Markov Chain and just repeat the same thing over and over.
       | 
        | Is this still possible with the latest models being trained on
        | synthetic data? And if it is possible, what would that one
        | phrase be?
        
         | roadside_picnic wrote:
          | That original model collapse paper has largely been
          | misunderstood; in practice, this only holds if you're not
          | curating the generated data at all. The original paper even
          | specifies (emphasis mine):
         | 
         | > We find that _indiscriminate_ use of model-generated content
         | in training causes irreversible defects in the resulting
         | models, in which tails of the original content distribution
         | disappear. [0]
         | 
          | In practice _nobody_ is "indiscriminately" using model output
          | to fine-tune models, since that doesn't even make sense. Even
          | if you're harvesting web data generated by LLMs, that data has
          | in fact been curated: its acceptance on whatever platform you
          | found it on is a form of curation.
         | 
         | There was a very recent paper _Supervised Fine Tuning on
         | Curated Data is Reinforcement Learning (and can be improved)_
         | [1] whose content is pretty well summarized by the title. So
         | long as the data is curated in some way, you are providing more
         | information to the model and the results should improve
         | somewhat.
         | 
         | 0. https://www.nature.com/articles/s41586-024-07566-y
         | 
         | 1. https://www.arxiv.org/pdf/2507.12856
         | 
          | edit: updated based on cootsnuck's comment
        
           | cootsnuck wrote:
            | There are multiple papers on model collapse. Being able to
            | avoid model collapse is different from it "being disproven".
            | 
            | If you just mean its risk has been exaggerated and/or
            | oversimplified, then yeah, you'd have a point.
        
             | roadside_picnic wrote:
              | Fair point, I've updated the post to highlight that even
              | the original paper specifies "indiscriminate" use of model
              | outputs.
             | 
              | Having spent quite a bit of time diving into many
              | questionable "research" papers (the original model collapse
              | paper is not actually one of these; it's a solid paper),
              | I've seen a very common pattern of showing that something
              | does or does not work under special conditions while
              | casually making generalized claims about those results.
              | It's so easy with LLMs to find a way to get the result you
              | want that there are far too many papers out there that
              | people quickly take as fact when the claims are much, much
              | weaker than the papers let on. So I tend to get a _bit_
              | reactionary when addressing many of these "facts" about
              | LLMs.
             | 
             | But you are correct that with the model collapse paper this
             | is much more the public misunderstanding the claims of the
             | original paper than any fault with that paper itself.
        
       | AnotherGoodName wrote:
        | I feel like we need a video on Dynamic Markov Chains. It's a
        | method for building a Markov chain from data, and it's used in
        | all the highest-compression winners of the Hutter Prize (a
        | competition to compress data the most).
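        | 
        | (As I understand it, the DMC variants used in compressors are
        | more involved - bit-level modeling with adaptive state cloning -
        | but the core idea of fitting a Markov chain from data can be
        | sketched in a few lines of Python, estimating first-order
        | transition probabilities from an observed sequence:)
        | 
        |     from collections import defaultdict
        | 
        |     def fit_markov_chain(symbols):
        |         """Estimate first-order transition probabilities from data."""
        |         counts = defaultdict(lambda: defaultdict(int))
        |         for prev, nxt in zip(symbols, symbols[1:]):
        |             counts[prev][nxt] += 1
        |         return {s: {t: c / sum(nxts.values()) for t, c in nxts.items()}
        |                 for s, nxts in counts.items()}
        | 
        |     print(fit_markov_chain("abracadabra"))
        |     # e.g. P(next='b' | current='a') = 0.5, P('c' | 'a') = 0.25, ...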
        
         | jadbox wrote:
         | Make your own video then :)
        
       ___________________________________________________________________
       (page generated 2025-07-30 23:00 UTC)