[HN Gopher] Problem solving often a matter of cooking up an appr...
___________________________________________________________________
Problem solving often a matter of cooking up an appropriate Markov
chain (2007)
Author : Alifatisk
Score : 188 points
Date : 2025-07-30 12:29 UTC (10 hours ago)
(HTM) web link (math.uchicago.edu)
(TXT) w3m dump (math.uchicago.edu)
| satvikpendem wrote:
| I assume OP watched this excellent Veritasium video about Markov
| and his chains [0] and posted this article which was referenced
| in that video.
|
| [0] https://youtu.be/KZeIEiBrT_w
| Alifatisk wrote:
| You nailed it, I wanted to post the actual article but found
| that it was behind a paywall, and I don't know if sharing
| pirated content is allowed here, but it's definitely worth a
| read!
| danlitt wrote:
| It is often not piracy to find open copies of papers; the
| authors often publish them directly (on their personal pages
| or on arxiv). A commenter further down found a public link:
| https://math.uchicago.edu/~shmuel/Network-course-
| readings/Ma...
| stevenAthompson wrote:
| I assumed he read the Illuminatus! Trilogy and wondered where
| Markoff Chaney's name originated... There might be something
| wrong with me, now that I think about it. /s
| reverendsteveii wrote:
| That's where I first learned about them.
|
| NO SMOKING. NO SPITTING. MGMT
| baxtr wrote:
| 3.4 million views in 4 days. Wow...
| stronglikedan wrote:
| Veritasium is quality content. Those eyes don't hurt nothing
| either.
| tbalsam wrote:
| They unfortunately sold out to private equity in the last few
| years (which tends to gloss over fundamentals and pump out
| masses of content, trading on the previous brand quality for
| credence), so beware of the quality of more recent vids:
|
| https://youtu.be/hJ-rRXWhElI?si=Zdsj9i_raNLnajzi
| fn-mote wrote:
| Yesterday I was reading comments about how the market
| could pay for research and avoid the "distorting effects"
| of public funding.
|
| Is there any way to get a better outcome for the public
| here, or is "do good stuff then sell out" the way it's
| always going to be?
| edwardbernays wrote:
| What distorting effects of public funding? What about the
| distortionary effects of the market? I'll offer the
| suggestion that what you read is brainrotting private
| market propaganda designed to erode the public
| institutions that make America happier, healthier, and
| wealthier.
| aarond0623 wrote:
| He's had a couple of misleading videos over the last few
| years that finally made me unsubscribe. Specifically the
| lightbulb with a 1 light second wire and the more recent
| video about light taking infinite paths.
|
| There was also the Waymo ad and the Rods from the Gods video
| where he couldn't be bothered to use a guide wire to aim.
| QuadmasterXLII wrote:
| Private equity baby! not just for shitting up your
| dentists and toy stores anymore
| deadso wrote:
| What was wrong with the 1 light second wire and the light
| taking infinite paths videos?
| keeda wrote:
| +1 I would like to know too. Especially the experimental
| demo of infinite paths -- I'm a complete noob in quantum
| physics, and the video made sense of so many topics I
| "learned" in college but never managed to grok. It'd be
| good to know what the alternative explanation is.
| aarond0623 wrote:
| The first one was portrayed in a clickbait "everything
| you know about electricity is wrong" way. There have been
| several response videos that lay out why it's misleading
| better than I can, but suffice it to say that the lightbulb
| does not turn on immediately like he claims.
|
| The second one takes a mathematical model, the path
| integral for light, and portrays it as if that's actually
| what is happening, with plenty of phrases like light
| "chooses" the path of least action that imply something
| more is going on. Also, the experiment at the end with the
| laser pointer is awful. The light we are seeing is
| scattering from the laser pointer's aperture, not
| evidence that light is taking alternate paths.
| ww520 wrote:
| Yeah, the light propagation videos are just high on
| misleading theories.
| teaearlgraycold wrote:
| The 1 light second wire video is kinda set up to
| bamboozle you. But it's still correct and taught me about
| EM.
| javier2 wrote:
| Especially the way he (or the team) responded to the
| criticism they got for doing those "sponsored content"
| pieces put me off hard enough to unsubscribe.
| edwardbernays wrote:
| Nah, he transparently accepted money from waymo to peddle
| propaganda. Once somebody takes propaganda money, there's
| no trusting them anymore. From then on out, everything they
| do is in service to propaganda paymasters. Even just doing
| regular, good quality work can only be viewed through the
| lens of acquiring social capital to liquidate into
| financial capital later.
|
| See: Brian Keating licking Eric Weinstein's jock strap in
| public and then offering mild criticism on Piers Morgan.
| pstuart wrote:
| > transparently accepted money from waymo to peddle
| propaganda
|
| If transparent enough (and not from an abhorrent source),
| I'd be ok with his product. He's even allowed to make the
| occasional mistake as long as he properly owns up to it.
|
| There's been a lot of valuable learning from him and it
| would be a pity to dismiss it all over a single fumble.
| stevenAthompson wrote:
| Here's a direct link to a PDF.
|
| http://math.uchicago.edu/~shmuel/Network-course-readings/Mar...
| tomhow wrote:
| Updated, thanks!
| tantalor wrote:
| (2007)
| Iwan-Zotow wrote:
| And? MC didn't change much, did they?
| taeric wrote:
| Putting the date is often a good affordance to help people
| scanning the headlines know if they have seen this one
| before. It is not, necessarily, meant as a knock against it.
| gertlex wrote:
| Especially with academic papers where publish dates are
| often not easy to find in a super-quick skim.
|
| (there's a copyright 2007 at the bottom of the linked page,
| which isn't explicitly "published in 2007" in my mind)
| postit wrote:
| Markov Chains are a gateway drug to more advanced probabilistic
| graphical models, which are worth exploring. I still remember
| working through Koller & Friedman cover to cover as one of the
| best learning experiences I've ever had.
| notrealyme123 wrote:
| Just bought it last week. This book just makes me happy.
|
| It feels like Bishop's Pattern Recognition but with a clearer
| tone (and a different field of course)
| roadside_picnic wrote:
| PGMs are a great topic to explore (and Koller+Friedman is a
| great book), but, as a word of caution to anyone interested:
| implementation of any of these more advanced models remains a
| _major_ challenge. For anyone building production-facing
| models, even if your problem is a pretty good match for the
| more interesting PGMs, the engineering requirements alone are a
| good reason not to go too far down that path.
|
| The PGM book is also structured very clearly for researchers in
| PGMs. The book is laid out in 3 major sections: the models,
| inference techniques (the bulk of the book), and learning.
| Which means, if you follow the logic of the book, you basically
| have to work through 1000+ pages of content before you can
| actually start running even toy versions of these models. If
| you do need to get into the nitty-gritty of particular
| inference algorithms, I don't believe there is another textbook
| with nearly the level of scope and detail.
|
| Bishop's section on PGMs from _Pattern Recognition and Machine
| Learning_ is probably a better place to start learning about
| these more advanced models, and if you become very interested
| then Koller+Friedman will be an invaluable text.
|
| It's worth noting that the PGM course taught by Koller was one
| of the initial, and still very excellent, Coursera courses. I'm
| not sure if it's still free, but it was a nice way to get a
| deep dive into the topic in a reasonably short time frame (I do
| remember those homeworks as brutal though!)[0].
|
| 0. https://www.coursera.org/specializations/probabilistic-
| graph...
| yanovskishai wrote:
| Played with Bayesian nets a bit in grad school--Pearl's
| causality stuff is still mind-blowing--but I've almost never
| bumped into a PGM in production. A couple of things kept
| biting us. Inference pain: exact inference is NP-hard, and
| the usual hacks (loopy BP, variational, MCMC) need a ton of
| hand-tuning before they run fast enough.
|
| The data never fits the graph. Real-world tables are messy
| and full of hidden junk, so you either spend weeks arguing
| over structure or give up the nice causal story.
|
| DL stole the mind-share. A transformer is a one-liner with a
| mature tooling stack; hard to argue with that when deadlines
| loom.
|
| That said, they're not completely dead - reportedly
| Microsoft's TrueSkill (Xbox ranking), a bunch of Google
| ops/diagnosis pipelines, some healthcare diagnosis tools by
| IBM Watson built on Infer.NET.
|
| Anyone here actually shipped a PGM that beat a neural
| baseline? Would really love to hear your war stories.
| mindcrime wrote:
| Heh, somebody watched that same Veritasium video about Markov
| Chains and Monte Carlo methods. Great video; lots of interesting
| historical stuff in there that I didn't know (like the feud
| between Markov and Nekrasov).
| carlmcqueen wrote:
| For a while early in my career I was working on Monte Carlo
| sims in my job in finance, just re-building an existing archaic
| Excel monster in Python to be more flexible for new investment
| models and to allow for more levers. Since I was already
| working with them daily, I began applying Monte Carlo models to
| a lot more problems I was thinking about. It truly is a fun
| tool to play with, especially when you're in the thick of
| designing them.
| amelius wrote:
| If you're holding a hammer every problem looks like a nail ...
| tempodox wrote:
| Yea, if it were true, then Markov chains on steroids (LLMs)
| would be super at problem solving, but they can't even reason.
| And they get worse if you add random facts about cats:
| https://news.ycombinator.com/item?id=44724238
| CrazyStat wrote:
| That doesn't follow at all, but I guess don't pass up any
| opportunity to bash LLMs.
| ducktective wrote:
| LLMs are Markov Chains on steroids?
|
| By the way, does anyone know which model or type of model was
| used in winning gold in IMO?
| magicalhippo wrote:
| > LLMs are Markov Chains on steroids?
|
| Might be a reference to this[1] blog post which was posted
| here[2] a year ago.
|
| There has also been some academic work linking the two,
| like this[3] paper.
|
| [1]: https://elijahpotter.dev/articles/markov_chains_are_th
| e_orig...
|
| [2]: https://news.ycombinator.com/item?id=39213410
|
| [3]: https://arxiv.org/abs/2410.02724
| roadside_picnic wrote:
| > LLMs are Markov Chains on steroids?
|
| It's not an unreasonable view, at least for decoder-only
| LLMs (which is what most popular LLMs are). While it may
| seem they violate the Markov property, since they clearly do
| make use of their history, in practice that entire history
| is summarized in an embedding passed into the decoder. I.e.
| just like a Markov chain, their entire history is compressed
| into a single point, which leaves them conditionally
| independent of their past given their present state.
|
| It's worth noting that this claim is NOT _generally
| applicable_ to LLMs, since both encoder/decoder and
| encoder-only LLMs _do violate_ the Markov property and
| therefore cannot properly be considered Markov chains in a
| meaningful way.
|
| But running inference on a decoder-only model is, at a high
| enough level of abstraction, conceptually the same as
| running a Markov chain (on steroids).
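|
| A toy sketch of that framing (my own illustration, nothing
| from any paper linked here): if the "state" is the whole
| token prefix generated so far, next-token sampling only ever
| reads that state, which is exactly a Markov transition.
|
|     import random
|
|     # Hypothetical toy "language model": a hand-written table
|     # from the current state (the full token prefix) to a
|     # distribution over the next token. A real decoder-only LLM
|     # computes this distribution with a network, but the
|     # sampling loop has the same structure: the next state
|     # depends only on the current one.
|     NEXT = {
|         ("the",): {"cat": 0.6, "dog": 0.4},
|         ("the", "cat"): {"sat": 1.0},
|         ("the", "dog"): {"ran": 1.0},
|     }
|
|     def step(state):
|         # One Markov transition: sample the next token given
|         # only the current state.
|         dist = NEXT.get(state, {"<eos>": 1.0})
|         tokens, probs = zip(*dist.items())
|         return state + (random.choices(tokens, weights=probs)[0],)
|
|     state = ("the",)
|     while state[-1] != "<eos>":
|         state = step(state)
|     print(" ".join(state))  # e.g. "the cat sat <eos>"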
| naasking wrote:
| Unless the problem is "quantum mechanics", which reduces to
| non-Markovian processes with unistochastic laws; ironically,
| this makes for a causally local QM:
|
| https://arxiv.org/abs/2402.16935v1
| Pamar wrote:
| I am on the move so I cannot check the video (but I did skim the
| pdf). Is there any chance to see an example of this technique?
| Just a toy/trivial example would be great, TIA!
| mindcrime wrote:
| For the Monte Carlo Method stuff in particular[1], I get the
| sense that the most iconic "Hello, World" example is using MC
| to calculate the value of pi. I can't explain it in detail from
| memory, but it's approximately something like this:
|
| Define a square of some known size (1x1 should be fine, I
| think)
|
| Inscribe a circle inside the square
|
| Generate random points inside the square
|
| Look at how many fall inside the square but not the circle,
| versus the ones that do fall in the circle.
|
| From that, using what you know about the area of the square and
| circle respectively, the ratio of "inside square but not in
| circle" and "inside circle" points can be used to set up an
| equation for the value of pi.
|
| Somebody who's more familiar with this than me can probably fix
| the details I got wrong, but I think that's the general spirit
| of it.
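|
| For concreteness, here's a rough Python sketch of that idea (my
| own code, so treat the details as illustrative): with a unit
| square and an inscribed circle of radius 1/2, the fraction of
| random points landing inside the circle approaches pi/4, so
| multiplying by 4 estimates pi.
|
|     import random
|
|     def estimate_pi(n=1_000_000):
|         inside = 0
|         for _ in range(n):
|             # Random point in a unit square centered on the origin.
|             x = random.uniform(-0.5, 0.5)
|             y = random.uniform(-0.5, 0.5)
|             # Does it fall inside the inscribed circle (radius 0.5)?
|             if x * x + y * y <= 0.25:
|                 inside += 1
|         # Area ratio: (pi * 0.5**2) / 1**2 = pi / 4
|         return 4 * inside / n
|
|     print(estimate_pi())  # ~3.14; converges slowly, roughly 1/sqrt(n)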
|
| For Markov Chains in general, the only thing that jumps to mind
| for me is generating text for old school IRC bots. :-)
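|
| A tiny sketch of that kind of bot (again my own toy code, not
| from the article): build the transition table by counting which
| word follows each pair of words in a corpus, then take a random
| walk over it.
|
|     import random
|     from collections import defaultdict
|
|     def build_chain(text, order=2):
|         # Map each tuple of `order` consecutive words to the
|         # list of words that follow it in the corpus.
|         words = text.split()
|         chain = defaultdict(list)
|         for i in range(len(words) - order):
|             chain[tuple(words[i:i + order])].append(words[i + order])
|         return chain
|
|     def babble(chain, length=20):
|         # Random walk: each step depends only on the current state.
|         state = random.choice(list(chain))
|         out = list(state)
|         for _ in range(length):
|             followers = chain.get(state)
|             if not followers:
|                 break
|             nxt = random.choice(followers)
|             out.append(nxt)
|             state = state[1:] + (nxt,)
|         return " ".join(out)
|
|     corpus = "the cat sat on the mat and the cat ran off the mat"
|     print(babble(build_chain(corpus)))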
|
| [1]: which is probably not the point of this essay. Sorry for
| muddying the waters; I have both concepts kinda 'top of mind'
| in my head right now after watching the Veritasium video.
| ilyakaminsky wrote:
| TIL, thanks! I asked Claude to generate a simulator [1] based
| on your comment. I think it came out well.
|
| [1] https://claude.ai/public/artifacts/1b921a50-897e-4d9e-8cf
| a-0...
| jldugger wrote:
| > From that, using what you know about the area of the square
| and circle respectively, the ratio of "inside square but not
| in circle" and "inside circle" points can be used to set up
| an equation for the value of pi.
|
| Back in like 9th grade, when Wikipedia did not yet exist (but
| MathWorld and IRC did) I did this with TI-Basic instead of
| paying attention in geometry class. It's cool, but converges
| hilariously slowly. The in versus out formula is basically
| distance from origin > 1, but you end up double sampling a
| lot using randomness.
|
| I told a college roommate about it and he basically invented
| a calculus approach summing pixels in columns or something as
| an optimization. You could probably further optimize by
| finding upper and lower bounds of the "frontier" of the
| circle, or iteratively splitting rectangle slices ad
| infinitum, but that's probably just reinventing the state of
| the art. And all this skips the cool random sampling the
| Monte Carlo algorithm uses.
| nyeah wrote:
| I mean, often, sure. Also problem solving is often a matter of
| making a spreadsheet full of +, -, *, / operations. Problem
| solving is often a matter of counting permutations and
| combinations. It's often a matter of setting up the right system
| of ordinary differential equations. It's often a matter of
| setting up the right linear algebra problem.
|
| It's often a matter of asking the right person what technique
| works. It's often a matter of making a measurement before getting
| lost in the math. It's often a matter of finding the right paper
| in the literature.
| theturtletalks wrote:
| The Veritasium video brought up an interesting point about how
| LLMs, if trained too much on their own content, will fall victim
| to the Markov Chain and just repeat the same thing over and over.
|
| Is this still possible with the latest models being trained on
| synthetic data? And if it is possible, what would that one
| phrase be?
| roadside_picnic wrote:
| That original model collapse paper has largely been
| misunderstood and in practice, this is only true if you're not
| curating the generated data at all. The original paper even
| specifies (emphasis mine):
|
| > We find that _indiscriminate_ use of model-generated content
| in training causes irreversible defects in the resulting
| models, in which tails of the original content distribution
| disappear. [0]
|
| In practice _nobody_ is "indiscriminately" using model output
| to fine-tune models since that doesn't even make sense. Even
| if you're harvesting web data generated by LLMs, that data has
| in fact been curated: its acceptance on whatever platform you
| found it on is a form of curation.
|
| There was a very recent paper _Supervised Fine Tuning on
| Curated Data is Reinforcement Learning (and can be improved)_
| [1] whose content is pretty well summarized by the title. So
| long as the data is curated in some way, you are providing more
| information to the model and the results should improve
| somewhat.
|
| 0. https://www.nature.com/articles/s41586-024-07566-y
|
| 1. https://www.arxiv.org/pdf/2507.12856
|
| edit: updated based on cootsnuck's comment
| cootsnuck wrote:
| There are multiple papers on model collapse. Being able to
| avoid model collapse is different from it "being disproven".
|
| If you just mean its risk has been exaggerated and/or
| oversimplified, then yea, you'd have a point.
| roadside_picnic wrote:
| Fair point, I've updated the post to highlight that even
| the original paper specifies "indiscriminate" use of model
| outputs.
|
| Having spent quite a bit of time diving into many
| questionable "research" papers (the original model collapse
| paper is not actually one of these, it's a solid paper),
| there's a very common pattern of showing that something
| does or does not work under special conditions but casually
| making generalized claims about those results. It's so easy
| with LLMs to find a way to get the result you want that
| there are far too many papers out there that people quickly
| take as fact when the claims are much, much weaker than the
| papers let on. So I tend to get a _bit_ reactionary when
| addressing many of these "facts" about LLMs.
|
| But you are correct that with the model collapse paper this
| is much more the public misunderstanding the claims of the
| original paper than any fault with that paper itself.
| AnotherGoodName wrote:
| I feel like we need a video on Dynamic Markov Chains. It's a
| method for building a Markov chain from data, and it's used in
| all of the highest-compression winners of the Hutter Prize (a
| competition to compress data the most).
| jadbox wrote:
| Make your own video then :)
___________________________________________________________________
(page generated 2025-07-30 23:00 UTC)