[HN Gopher] Generating Text with Markov Chains
       ___________________________________________________________________
        
       Generating Text with Markov Chains
        
       Author : healeycodes
       Score  : 55 points
       Date   : 2021-02-02 16:24 UTC (6 hours ago)
        
 (HTM) web link (healeycodes.com)
 (TXT) w3m dump (healeycodes.com)
        
       | zhengyi13 wrote:
       | This is a neat introduction to the subject.
       | 
       | If you want to see more of what they can do, and you have had any
       | exposure to Chomsky, you might also appreciate
       | https://rubberducky.org/cgi-bin/chomsky.pl.
        
       | phaedrus wrote:
        | I used to write Markov-based chat bots. Something I thought I
        | observed, but tried and failed to show mathematically, is that
        | well-connected neighborhoods of the graph (cliques?) can make
        | some outputs more likely than their a priori frequency in the
        | input would suggest.
       | 
       | For example, once a simple Markov bot learns the input phrase
       | "had had" it will also start to generate output phrases a human
       | would assign 0% probability to like "had had had" and "had had
       | had had". This in itself isn't a violation of the principles
       | behind the model (it would have to look at more words to
       | distinguish these).
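        | 
        | For illustration, here's a minimal sketch of the kind of
        | first-order word chain I mean (the corpus here is a toy
        | stand-in, not my original bot's chat logs):
        | 
        |   import random
        |   from collections import defaultdict
        | 
        |   def build_chain(text):
        |       # Map each word to the words that follow it.
        |       words = text.split()
        |       chain = defaultdict(list)
        |       for cur, nxt in zip(words, words[1:]):
        |           chain[cur].append(nxt)
        |       return chain
        | 
        |   def generate(chain, start, length=8):
        |       # Random walk; successors are drawn in proportion
        |       # to how often they followed the current word.
        |       out = [start]
        |       for _ in range(length - 1):
        |           followers = chain.get(out[-1])
        |           if not followers:
        |               break
        |           out.append(random.choice(followers))
        |       return " ".join(out)
        | 
        |   # Once "had had" is in the corpus, "had" can follow
        |   # itself, so runs like "had had had" become reachable.
        |   corpus = "he had had a cold and she had had enough"
        |   print(generate(build_chain(corpus), "had"))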
       | 
       | The question is whether more complicated loops among related
       | words can create "thickets" where the output generation can get
       | "tangled up" and generate improbable output at a slightly higher
       | rate than the formal analysis says a Markov model of that order
       | should do for given input frequencies. An example of such a
       | thicket would be something like, "have to have had had to have
       | had".
       | 
        | Essentially, I'm hypothesizing that the weighted transition
        | probabilities do not tell the whole story, because the
        | high-level structure of the graph has an effect of its own. A
        | weaker hypothesis is that the state space of Markov models
        | contains such pathological examples, but that these states are
        | not reachable by normal learning.
       | 
        | Unfortunately I lack the mathematical chops / expertise to
        | formalize these ideas myself, and I don't personally know
        | anyone who could help me explore them.
        
         | ben_w wrote:
          | I have observed the same phenomenon with both simple Markov
         | chains and whatever Apple used for autosuggest on the iPhone.
         | 
         | That said, your specific example reminded me of a specific joke
         | sentence:
         | https://en.m.wikipedia.org/wiki/James_while_John_had_had_had...
        
           | phaedrus wrote:
           | I actually hadn't seen that one (I did know of the "Buffalo
           | buffalo" family of sentences).
           | 
           | Regarding observation of the phenomenon, in a way I first
           | observed the inverse of it. My first proofs of concept didn't
           | learn probabilities, only structure. (One might say all edges
           | had the same probability.) Yet aesthetically they performed
           | just as well as versions which included probability in the
           | algorithm.
        
         | phreeza wrote:
          | So basically, to calculate the stationary distribution of a
          | Markov chain (the distribution it settles into over a long
          | run), you calculate the eigenvector of the transition matrix
          | with eigenvalue 1 (a left eigenvector, if rows hold the
          | outgoing probabilities). That eigenvector already takes the
          | graph structure into account, so I think the stationary
          | distribution is exactly what this calculation gives; there
          | isn't really any space for anything beyond that.
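          | 
          | As a small numpy sketch (the 3-state transition matrix is
          | made up, just to show the calculation):
          | 
          |   import numpy as np
          | 
          |   # Hypothetical chain; rows of P sum to 1, and
          |   # P[i, j] is the chance of moving from state i to j.
          |   P = np.array([[0.1, 0.6, 0.3],
          |                 [0.4, 0.4, 0.2],
          |                 [0.5, 0.3, 0.2]])
          | 
          |   # The stationary distribution pi satisfies pi @ P = pi,
          |   # so it is the eigenvector of P.T with eigenvalue 1,
          |   # rescaled to sum to 1.
          |   vals, vecs = np.linalg.eig(P.T)
          |   pi = np.real(vecs[:, np.argmax(np.isclose(vals, 1.0))])
          |   pi = pi / pi.sum()
          | 
          |   print(pi)      # long-run share of time in each state
          |   print(pi @ P)  # equal to pi again, as a sanity check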
         | 
          | Incidentally, this is very similar to the original PageRank
          | algorithm. The rank roughly corresponds to the probability
          | that you will be on any given page if you randomly follow
          | links.
        
        | spiderxxxx wrote:
        | I've done something similar without first learning about Markov
        | chains. One of my more interesting experiments was creating
        | messed-up laws: I fed it the Constitution and Alice in
        | Wonderland, and it made the most surreal laws. The great thing
        | about these chains is that they don't need to know anything
        | about language. You could make one to create images, another
        | to create names for cities. I made one to create
        | 'pronounceable passwords': it took the top 1000 words of a
        | dictionary and then spat out things that could potentially be
        | words, of any length. Of course, the pronounceability of a
        | word like Shestatond is debatable.
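        | 
        | A rough sketch of the password generator idea, assuming a
        | character-level chain of order 2 and a tiny toy word list
        | in place of the real top-1000 words:
        | 
        |   import random
        |   from collections import defaultdict
        | 
        |   def learn(words, order=2):
        |       # Map each run of `order` letters to the letters
        |       # seen after it; None marks the end of a word.
        |       chain = defaultdict(list)
        |       for word in words:
        |           padded = "^" * order + word  # "^" pads the start
        |           for i in range(order, len(padded)):
        |               chain[padded[i - order:i]].append(padded[i])
        |           chain[padded[-order:]].append(None)
        |       return chain
        | 
        |   def pseudo_word(chain, order=2):
        |       # Walk letter by letter until an end marker.
        |       state, out = "^" * order, []
        |       while True:
        |           nxt = random.choice(chain[state])
        |           if nxt is None:
        |               return "".join(out)
        |           out.append(nxt)
        |           state = state[1:] + nxt
        | 
        |   # Toy list standing in for the "top 1000 words".
        |   words = ["shell", "estate", "station", "pond", "second"]
        |   chain = learn(words)
        |   print(pseudo_word(chain))  # a made-up, word-shaped string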
        
        | mkaic wrote:
        | Thank you for this clear and instructive write-up! I
        | appreciate the high-level, concise breakdown.
        
       | not2b wrote:
       | I published a program to do basically this on Usenet in 1987.
       | 
       | Someone created a fixed version on github at
       | 
       | https://github.com/cheako/markov3
        
         | fit2rule wrote:
         | I probably learned Markov from you then.
        
       | monokai_nl wrote:
       | 8 years ago I implemented something like this for your tweets at
       | https://thatcan.be/my/next/tweet/
       | 
       | It still causes a spike of traffic every now and then from
       | Twitter.
        
       | kleer001 wrote:
       | Yup, those are Markov Chains. Not to sound snotty or mean, but...
       | so what? Why should we be interested?
       | 
       | Looks kinda like you followed a tutorial and did a write up.
        
         | gsich wrote:
         | The same result generated with "AI" doesn't get this question.
        
           | kleer001 wrote:
           | Examples?
           | 
           | IMHO they probably should if they're this basic.
        
         | anaerobicover wrote:
         | It's a well-written practical demonstration of an interesting
         | thing that [some people might not know about yet][0], with
         | links for further reading. What's the problem?
         | 
         | [0]:https://xkcd.com/346/
        
           | kleer001 wrote:
           | No problem. Didn't say I had one. Just curious, wanted more
           | information about why OP posted, what made their post
           | special.
           | 
            | Still, it didn't seem very novel or deep. It might as
            | well have been someone posting their first timing-circuit
            | test, water/air tunnel simulation, or computer vision
            | project. So, good for them. But I thought HN wasn't a
            | glad-handing cheerleader chorus.
           | 
           | Searching for Introduction to Markov Chains gets tons of hits
           | at various levels of explanation that are nearly a decade
           | old.
        
           | not_knuth wrote:
           | Also https://xkcd.com/1053/
        
         | healeycodes wrote:
         | I tried to code up Markov chains when I was first learning to
         | program but I found that many resources didn't have clear/terse
         | enough code snippets. Most articles also didn't walk you from
         | zero to understanding.
         | 
         | So I'm trying to fill that gap and help out anyone who is
         | trying to learn the basics.
        
       | inbx0 wrote:
       | For the Finns reading this, there's a Twitter bot "ylekov" [1]
       | that combines headlines from different Finnish news outlets using
        | Markov chains. Sometimes they come out pretty funny.
        | 
        | > Suora lähetys virkaanastujaisista - ainakin kaksi tonnia
        | kokaiinia ("Live broadcast of the inauguration - at least
        | two tonnes of cocaine")
       | 
       | [1]: https://twitter.com/ylekov_uutiset
        
       | hermitcrab wrote:
        | I wrote 'Bloviate' to mess around with Markov chains. From
        | Goldilocks and the Three Bears it produces gems such as:
       | 
       | "Someone's been sitting my porridge," said the bedroom.
       | 
       | You can download it here:
       | https://successfulsoftware.net/2019/04/02/bloviate/
        
       ___________________________________________________________________
       (page generated 2021-02-02 23:01 UTC)