[HN Gopher] Markov chains are funnier than LLMs
___________________________________________________________________
Markov chains are funnier than LLMs
Author : todsacerdoti
Score : 485 points
Date : 2024-08-18 22:52 UTC (1 day ago)
(HTM) web link (emnudge.dev)
(TXT) w3m dump (emnudge.dev)
| ggm wrote:
| https://en.wikipedia.org/wiki/Mark_V._Shaney
| emmelaich wrote:
| The Policeman's Beard is Half-Constructed.
|
| https://archive.org/details/policemansbeardi0000unse
| superkuh wrote:
| I've been running IRC markov chain bots for ~20 years. In the
| last couple years I've been running local LLMs alongside them.
| While there are people who still prefer the markov chain bots,
| the majority invoke the LLMs. But maybe that's because I picked
| the most hallucinate'y, least refusing, funniest models I could
| (like mistral-7B fine tunes) instead of something smart, locked-
| down, and low temperature like chatgpt's LLM as a service.
|
| If anything seeing the LLM and markov bots side by side has
| really reinforced how much of the markov bot "humor" is human
| perception imposed on chance outputs. The markov's "learning"
| ability is still far superior though.
| kdbg wrote:
| curious what type of prompting you do on the LLM?
|
| I run a markov chain bot in a Twitch chat; it has some great
| moments. I tried using an LLM for a while and would include
| recent chat in the prompting, but never really got results that
| came across as terribly humorous. I could prompt engineer a bit
| to tell it some specifics about the types of jokes to build, but
| the LLM just tended to always follow the same format.
| superkuh wrote:
| I'm actually not following the model's fine-tuned/desired
| prompt at all. I am operating in purely pattern completion
| mode. The first text the LLM sees is alternating lines of
| input and response examples that look like what it will be
| getting from the IRC client front end, written in the tone I
| want it to respond in and giving some information about itself.
| Then I just tack the IRC chat history+input onto those
| example chat pre-prompt lines. Nothing but single lines and
| newlines with newline as a stop token. No instructions,
| nothing meta or system or the like.
|
| But that's also configurable by users. They can invoke any
| pre-prompt they want by a command passing a URL with a .txt
| file.
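|
| Roughly, it's all just string assembly. A minimal sketch of the
| idea (the example lines and model path are made up here, and
| llama-cpp-python is just standing in for whatever local backend
| you use):
|
|     from llama_cpp import Llama
|
|     # Alternating example input/response lines in the tone I
|     # want; the live IRC history just gets appended after them.
|     PREPROMPT = [
|         "<user> what are you?",
|         "<bot> an old pile of markov dust with new batteries",
|         "<user> tell me something weird",
|         "<bot> the moon is mostly paperwork",
|     ]
|
|     llm = Llama(model_path="mistral-7b-finetune.gguf")
|
|     def reply(history, line):
|         # No system prompt, no instructions: single lines only,
|         # with newline as the stop token.
|         prompt = "\n".join(PREPROMPT + history + [line, "<bot>"])
|         out = llm(prompt, max_tokens=60, stop=["\n"],
|                   temperature=1.2)
|         return out["choices"][0]["text"].strip()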
| ghewgill wrote:
| > I've been running IRC markov chain bots for ~20 years.
|
| You are my hero. Mine have never lasted that long. One fun
| thing I did once was scrape users' livejournals and generate
| random text from them
| (https://hewgill.com/journal/entries/68-new-lj-toy.html).
| Loughla wrote:
| Why are you doing that? For fun or something else I'm missing?
| aidenn0 wrote:
| I'm not GP, but it's probably not for fun; there's a lot of
| money in IRC Markov chain bots, but it's too cutthroat of an
| industry for a hobbyist to last.
| Loughla wrote:
| How is there money in that?
| superkuh wrote:
| Pretty sure he's joking around. And yes, I run the bots
| purely for recreation in some long running IRC
| communities.
| AdieuToLogic wrote:
| > Asking an LLM for an "original thought" is almost oxymoronic,
| if not just moronic. It was built with the express purpose of not
| doing that.
|
| What a wonderful insight.
| make3 wrote:
| also, not really true, right, even though it sounds
| intellectual and strong to say. these algorithms are trained to
| generalize as best as they can to unseen text, and most often
| don't ever see any data point twice, except for data that has
| accidentally not been filtered. it's totally possible that they
| acquire reasoning abilities that generalize well.
| lossolo wrote:
| Generalize over their training data--they cannot generalize
| out of distribution. If they could, they would have already
| solved most human problems. So no, they do not generalize on
| unseen text. They will produce what is most statistically
| probable based on their training data. Things that are still
| unknown and statistically improbable based on our current
| knowledge are out of reach for LLMs based on transformers.
| make3 wrote:
| "generalize to its dataset" is a contradiction, especially
| as these models are trained in the one epoch regimen on
| datasets of the scale of all of the internet. if you think
| being able to generalize in ways similar to the whole of
| the internet does not give your meaningful abilities to
| reason, I'm not sure what I can tell you
| lossolo wrote:
| > "generalize to its dataset" is a contradiction
|
| Not "to" but over, example the same code written in one
| language over the other language.
|
| > if you think being able to generalize in ways similar
| to the whole of the internet does not give your
| meaningful abilities to reason, I'm not sure what I can
| tell you
|
| If, after reading the papers below that show empirically
| that they can't reason, you still think they can reason,
| then I don't know what I can tell you.
|
| https://arxiv.org/abs/2311.00871
|
| https://arxiv.org/abs/2309.13638
|
| https://arxiv.org/abs/2311.09247
|
| https://arxiv.org/abs/2305.18654
|
| https://arxiv.org/abs/2309.01809
| totetsu wrote:
| Couldn't they turn up new, as-yet-unknown things, if those
| things are statistically probable given the training data?
| lossolo wrote:
| No, none of the Millennium Problems or other math
| problems (unsolved by humans for decades or centuries)
| have been solved solely by LLMs, even though they possess
| all the knowledge in the world.
| vidarh wrote:
| You can get them to solve unseen problems just fine. E.g.
| one example: specify a grammar in BNF notation and tell it
| to generate or parse sentences for you. You can produce a
| random enough grammar that it can't have derived the
| parsing of it from past text, but it necessarily reasons
| about BNF notation sufficiently well to be able to deduce
| the grammar, and uses that to parse subsequent sentences.
| You can have it analyse them and tag them according to the
| grammar too. And generate sentences.
|
| My impression, from seeing quite a few people trying to
| demonstrate they can't handle out-of-distribution problems,
| is that people are _very_ predictable about how they go
| about this, and tend to pick well known problems that are
| likely to be overrepresented in the training set, and then
| tweak them a bit.
|
| At least in one instance the other day, what I got from GPT
| when I tried to replicate it suggests to me it did the same
| thing humans who have seen these problems before do, and
| carelessly failed to "pay attention" because it fit a well
| known template it's been exposed to a lot in training.
| After it answered wrong it was sufficient to ask it to
| "review the question and answer again" for it to spot the
| mistake and correct itself.
|
| I'm sure that won't work for every problem of this sort,
| but the quality of tests people do on LLMs is really awful,
| not least because people tend to do very narrow tests like
| that and make broad pronouncements about what LLMs "can't"
| do based on them.
| lossolo wrote:
| > You can get them to solve unseen problems just fine
|
| Prove that the problem wasn't seen by them in other form.
|
| > Specify a grammar in BNF notation and tell it to
| generate or parse sentences for you. You can produce a
| more than random enough grammar that it can't have
| derived the parsing of it from past text, but necessarily
| reasons about BNF notation sufficiently well to be able
| to use it to deduce the grammar, and use that to parse
| subsequent sentences. You can have it analyse them and
| tag them according to the grammar too. And generate
| sentences.
|
| Oh, come on. It's like rewriting the same program in
| another programming language with different variables.
| What it can't do is create the concept of a programming
| language. I'm not talking about a new programming
| language, I'm talking about the concepts.
|
| > I'm sure that won't work for every problem of this
| sort, but the quality of tests people do on LLMs is
| really awful, at least because people tend to do very
| narrow tests like that and make broad pronouncements
| about what LLM's "can't" do based on it.
|
| Here, a few papers that show they can't reason:
|
| https://arxiv.org/abs/2311.00871
|
| https://arxiv.org/abs/2309.13638
|
| https://arxiv.org/abs/2311.09247
|
| https://arxiv.org/abs/2305.18654
|
| https://arxiv.org/abs/2309.01809
| og_kalu wrote:
| >It's like rewriting the same program in another
| programming language with different variables.
|
| Since when has that not required reasoning? It's really
| funny seeing people bend over backwards to exclude LLMs
| from some imaginary "real reasoning" they imagine they
| are solely privy to. It's really obvious this is
| happening when they leave well defined criteria and
| branch into vague, ill-defined statements. What exactly
| do you mean by concepts? Can you engineer some test to
| demonstrate what you're talking about?
|
| Also, none of those papers show LLMs can't reason.
| lossolo wrote:
| You clearly didn't read any of these papers. Quote from
| one of them:
|
| "Our results support the hypothesis that GPT-4, perhaps
| the most capable "general" LLM currently available, is
| still not able to robustly form abstractions and reason
| about basic core concepts in contexts not previously seen
| in its training data"
|
| Another, recent, good one
| https://arxiv.org/abs/2407.03321
|
| EDIT: For people who don't want to read the papers, here
| is a blog post that explains what I'm arguing in more
| accessible terms https://cacm.acm.org/blogcacm/can-llms-
| really-reason-and-pla...
| og_kalu wrote:
| Oh, I've read them. The claim doesn't match up to reality.
| It's as simple as that. You can claim anything you want
| to.
|
| https://arxiv.org/abs/2305.18354
|
| All these papers you keep linking do is at best point out
| the shortcomings of current state of the art LLMs. They
| do not in any way disprove their ability to reason. I
| don't know when the word reason started having different
| standards for humans and machines but I don't care for
| it. Either your definition of reasoning also allows for
| the faulty kind humans display or humans don't reason
| either. You can't have your cake and eat it.
| lossolo wrote:
| > Oh i've read them.
|
| It's hard to believe that after reading all the papers
| and the blog I linked, along with the references there,
| any reasonable person would come to such strong
| conclusions as you did. This makes it hard for me to
| believe that you actually read all of them, especially
| given your previous questions and comments, which are
| addressed in those papers; someone who actually read
| them wouldn't make such comments or ask such questions.
| And the funniest thing, and further proof of this, is
| that you linked a paper that is addressed in one of the
| papers I shared. It seems like not only LLMs can fake
| things.
|
| > All these papers you keep linking do is at best point
| out the shortcomings of current state of the art LLMs
|
| They clearly show that they fake reasoning, and what they
| do is an advanced version of retrieval. Their claims are
| supported by evidence. What you call "shortcomings" are
| actually proof that they do not reason as humans do. It
| seems like your version of "reality" doesn't match
| reality.
| og_kalu wrote:
| The paper i linked is not addressed by the paper you
| linked. The paper you linked attempts to give LLMs the
| same benchmarks in a format they aren't best suited for.
| I don't know how you can call that "addressed".
|
| >They clearly show that they fake reasoning
|
| Sure, and planes are fake flying. The elusive "fake
| reasoning" that is so apparently obvious and yet does not
| seem to have a testable definition that excludes humans.
|
| You've still not explained how writing the same program
| in different languages doesn't require reasoning or how
| we can test your "correct" version of reasoning which
| requires "concepts".
| lossolo wrote:
| > The paper i linked is not addressed by the paper you
| linked. The paper you linked attempts to give LLMs the
| same benchmarks in a format they aren't best suited for.
| I don't know how you can call that "addressed".
|
| What you're writing now is nonsense in the context of what
| I wrote. Once again, you're showing that you didn't read
| the papers. Which paper are you even referring to now,
| the one you think addresses the paper you linked?
|
| > You've still not explained how writing the same program
| in different languages doesn't require reasoning or how
| we can test your "correct" version of reasoning which
| requires "concepts".
|
| "Concepts" are explained in one of the papers I linked,
| which you would know if you had actually read them. As to
| programming languages, they learn to identify common
| structures and idioms across languages. This allows them
| to map patterns (latent space representations duh!) from
| one language to another without reasoning about the
| underlying logic. When translating code, the model
| doesn't reason about the program's logic but predicts the
| most likely equivalent constructs in the target language
| based on the surrounding context. LLMs don't truly
| "understand" the semantics or purpose of the code they're
| translating. They operate on a superficial level,
| matching patterns and structures without grasping the
| underlying computational logic. The translation process
| for an LLM is a series of token-level transformations
| guided by learned probabilities, not a reasoned
| reinterpretation of the program's logic. They don't have
| an internal execution model or ability to "run" the code
| mentally. They perform translations based on learned
| patterns, not by simulating the program's behavior. The
| training objective of LLMs is to predict the next token,
| not to understand or reason about program semantics. This
| approach doesn't require or develop reasoning
| capabilities.
| og_kalu wrote:
| You are making a lot of assumptions that are mostly
| wrong.
|
| Case in point:
|
| https://arxiv.org/abs/2305.11169
|
| I'm asking for something testable, not some post-hoc
| rationalization you believe to be true.
|
| I'm not asking you to tell me how you think LLMs work.
| I'm asking you to define "real reasoning" such that I can
| test people and LLMs for it and distinguish "real
| reasoning" from "fake reasoning".
|
| This definition should include all humans while excluding
| all LLMs. If it cannot, then it's just an arbitrary
| distinction.
| HAL3000 wrote:
| It appears that you are the only person in this
| discussion making many incorrect assumptions. Based on
| your comments, I would assume you are actually googling
| those papers based on their abstracts. Your last linked
| paper has flawed methodology for what it attempts to
| demonstrate, as shown in this paper:
| https://arxiv.org/pdf/2307.02477 The tests you're
| requesting are provided within the previously linked
| papers. I'm not sure what you want. Do you expect people
| to copy and paste entire papers here that show
| methodology and describe experiments? You wrote, "I'm
| asking you to define 'real reasoning'," which is actually
| defined in the blog post linked earlier in this
| discussion. In fact, the entire blog post is about this
| topic. It appears that you are not thoroughly reading the
| material. Your replies resemble those of a human
| stochastic parrot.
| og_kalu wrote:
| >Your last linked paper has flawed methodology for what
| it attempts to demonstrate, as shown in this paper:
| https://arxiv.org/pdf/2307.02477
|
| Genuinely, what's wrong with the methodology?
|
| Your paper literally admits humans would also perform
| worse at counterfactuals. Worse than an LLM? Maybe not,
| but it never bothers to test this, so...
|
| The problem here is that none of the definitions (those
| that are testable) so far given actually separate humans
| from LLMs. They're all tests some humans would also
| flounder at or that LLMs perform far greater than chance
| at, if below some human's level.
|
| If you're going to say, "LLMs don't do real reasoning
| because of x" then x better be something all humans clear
| if what humans do is "real reasoning".
|
| Humans perform worse at counterfactuals so saying "Hey,
| see this paper that shows LLMs doing the same, It means
| they don't reason" is a logical fallacy if you don't
| extend that conclusion to humans as well.
| eru wrote:
| I can't tell whether you are sarcastic?
|
| I assume you are, because that makes more sense.
|
| It's really easy to get lots and lots of originality. Just
| crank up the randomness. What's harder is to get something
| that's good and original.
| vidarh wrote:
| That's hard for humans too. Perhaps if one tries to
| intersperse an "internal monologue", with the temperature
| cranked up, into the main output, prompting the LLM to
| consider and weave in aspects from the inner monologue when
| continuing, while taking care to keep it coherent... (I'm
| sure it'd be harder than that - but basically inject random
| internal "thoughts" without letting them _control_ the
| output)
| throwthrowuknow wrote:
| That's actually a pretty good idea
| complaintdept wrote:
| My favorite markov experiment was feeding it the works of
| Nietzsche from Project Gutenberg. Really brought out the late
| stage syphilis, but was still very distinctly Nietzsche.
| techjamie wrote:
| I love messing with Markov chains, and before Reddit locked
| down their API, I would feed my Markov chain comments from
| NSFW subreddits and political subreddits at different sample
| sizes.
| I would spend ages just watching it whiplash about some
| political topic and sampling thirst comments.
| duskwuff wrote:
| The live feeds are all dead now, but you can still download
| archives of Reddit posts/comments from before they locked
| down the API:
|
| https://academictorrents.com/details/9c263fc85366c1ef8f5bb9d.
| ..
| canjobear wrote:
| "Talk to Transformer" GPT-2 was very funny, then GPT-3 was not
| funny at all. We blew through the uncanny valley in a couple of
| years.
| hatthew wrote:
| I've posted a few "fake XYZ written by AI" on reddit over the
| years, and the model that got the best reactions was GPT-2.
| Markov chains aren't good enough to make something that is
| interesting for more than a sentence or two, and GPT-3 onwards is
| too clean and boring. GPT-2 is the perfect middle ground that can
| get grammar largely correct and maintain a cohesive idea, but
| doesn't yet know enough of the specifics of various topics to
| make things that make sense in the context of that topic.
| metadat wrote:
| Can you share any of the best examples?
| PawgerZ wrote:
| https://www.reddit.com/r/SubSimulatorGPT2/top/?t=all
|
| This is all GPT2 generations trained on reddit data.
|
| https://www.reddit.com/r/SubSimulatorGPT2/comments/btfhks/wh.
| ..
|
| Here's the subreddit explained
| Sohcahtoa82 wrote:
| I used my 15+ years of IRC logs to fine-tune a GPT-2 model to
| try to imitate me. My plan was to deploy a bot in my regular
| IRC channel and see how long it took for people to notice it
| was a bot. Any time someone would send a message, it would send
| the last 10 messages to the LLM, and if the result started with
| <Sohcahtoa>, then I would send that message to the channel.
|
| Unfortunately, I quickly found GPT-2 isn't nearly good enough.
| It would generate slightly-coherent yet on-topic nonsense.
|
| Once I overhaul my system, I'll try fine-tuning a 7B model.
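|
| The core loop is simple enough to sketch (the checkpoint path
| and names here are placeholders, not the actual bot):
|
|     from transformers import pipeline
|
|     # Fine-tuned GPT-2 checkpoint (placeholder path).
|     gen = pipeline("text-generation", model="./gpt2-irc")
|
|     def maybe_reply(last_messages):
|         # Feed the last N raw IRC lines and let GPT-2 continue.
|         prompt = "\n".join(last_messages) + "\n"
|         out = gen(prompt, max_new_tokens=40)[0]["generated_text"]
|         lines = out[len(prompt):].splitlines()
|         nxt = lines[0] if lines else ""
|         # Only speak when the model thinks the next line is mine.
|         if nxt.startswith("<Sohcahtoa>"):
|             return nxt
|         return None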
| loa_in_ wrote:
| I'm pretty sure you could make it work if you expanded the
| context to hundred or so messages.
| Sohcahtoa82 wrote:
| With the 7B parameter models, sure.
|
| Not with GPT-2 though. The context window is only 1024
| tokens. Even with only 10 messages, if they're long
| messages, it will exceed the context window.
| bigyikes wrote:
| Turn up the temperature?
|
| Markov chains have a cruder understanding of language.
|
| Turn up the temperature (the "randomness") of an LLM and you can
| achieve a similarly crude approximation.
|
| Further, the author uses ChatGPT-3.5. ChatGPT has been rlhf'd to
| sound as generic as possible, and 3.5 has a worse understanding
| of humor compared to 4.
|
| I don't buy the thesis of this article.
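|
| For anyone unfamiliar: temperature just rescales the logits
| before sampling, so cranking it up flattens the distribution
| and pushes the output toward Markov-style nonsense. A minimal
| sketch of the sampling step (toy numbers, not any particular
| model):
|
|     import math, random
|
|     def sample(logits, temperature=1.0):
|         # Pick a token index after temperature scaling.
|         scaled = [x / temperature for x in logits]
|         m = max(scaled)
|         exps = [math.exp(x - m) for x in scaled]
|         total = sum(exps)
|         r, acc = random.random(), 0.0
|         for i, e in enumerate(exps):
|             acc += e / total
|             if r <= acc:
|                 return i
|         return len(exps) - 1
|
|     logits = [2.0, 1.0, 0.2, -1.0]   # toy next-token scores
|     print(sample(logits, 0.7))       # low temp: near-greedy
|     print(sample(logits, 2.5))       # high temp: far more random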
| dartos wrote:
| I wish there was a locked chatgpt version for researchers
| eru wrote:
| Some of the open-weights models come close to what you might
| want?
| dartos wrote:
| Sort of. They fill the "locked" requirement, but open
| models aren't SOTA like GPT or Claude.
| coder543 wrote:
| Llama3.1-405B benchmarks at a level that I would
| certainly consider comparable to other SOTA models, as
| long as multimodality isn't needed.
|
| Regardless, OpenAI provides access to quite a few of
| their older models through the API, since the API lets
| you pass in a specific model version. I'm sure the older
| models won't be available forever, but that is a much
| more stable target for researchers than just opening the
| ChatGPT website and typing in things.
| dartos wrote:
| ChatGPT specifically is more than just a chat interface
| on top of GPT-4.
|
| Their system prompt includes the current date and time
| among other information, making it very very hard to run
| reproducible experiments against it.
|
| But it's the tool most people are using.
| Reubend wrote:
| LLAMA 3.1 405b-Instruct is state of the art at the time
| of writing. It averages slightly worse than ChatGPT 4o,
| but to a mostly negligible extent for most research.
| toxik wrote:
| Aside but -- it is also aligned to not generate profane
| text, right?
| throwthrowuknow wrote:
| Abliterate it
| rck wrote:
| You could play with open models and different temperatures
| using the lmgen script from this repo we just released:
|
| https://github.com/RichardKelley/hflm?tab=readme-ov-
| file#lmg...
| jijji wrote:
| there sort of is, if you install ollama (https://ollama.com)
| and then execute: ollama run llama2-uncensored it will
| install and run the local chat interface for llama2 in an
| uncensored version which gives a little bit better results
| with less guardrails. Same with wizardlm-uncensored and
| wizard-vicuna-uncensored. For reference, the SOTA version
| with guardrails you could run: ollama run llama3.1:405b
| ziddoap wrote:
| > _rlhf'd_
|
| For those of us not in the know about all the various machine
| learning acronyms:
|
| RLHF = Reinforcement learning from human feedback
| DaiPlusPlus wrote:
| (Somewhat off-topic)
|
| When GPT went public along with OpenAI's articles and papers
| back in late-2022 through 2023, my impression was OpenAI
| wanted us all to see/read about RLHF. It felt odd because
| surely the whole LLM-thing (e.g. how does it even work?!?[1])
| was the far bigger research-story than just constant
| reassurances it won't end-up like MSFT's Tay bot; my
| understanding is that, as research or secret sauce, RLHF,
| compared to the core meat-and-potatoes of LLMs, is an ugly-
| hack afterthought.
|
| By-way of a bad analogy: it's as if they created a
| fantastical new 3D world game engine, like Unreal or Unity,
| which has a fundamentally different architecture to anything
| before, but has a bug that occasionally replaces ground
| terrain with ocean - and their solution to this is to write a
| pixel-shader that detects this and color-shifts blue into
| green so people don't notice - and they then put-out press-
| releases about how great their pixel-shader is - rather than
| about the rest of the engine - and no-one seems to be talking
| about the underlying bug, let alone fixing it.
|
| ---------
|
| [1] I still haven't heard a decent explanation of how feeding
| the world's corpus of English text (and computer program
| code) into a statistical-modeller results in something that
| can perform almost any information-processing task via
| instructions input as natural-language.
| treyd wrote:
| Yeah that analogy is fairly poor. You have to think about
| it in terms of some probability distribution that each step
| of the model is sampling from. Out of the distribution of
| all text, find the top n values for the next token that
| maximize `P(next | prefix)`, which is done efficiently
| through making a vector embedding to encode the tokens
| inside the statistical model.
|
| Things that look like Q-and-A transcripts do exist in the
| training set, think interviews, books, stage plays, etc,
| and at a different layer of abstraction the rules of
| English text _in general_ are very well represented. What
| RLHF is doing is slightly shifting the shape of the
| probability distribution to make it look more like the
| Q-and-A formats that are desired. They build a large
| dataset with human tagging to collect samples of good and
| bad outputs, and use reinforcement learning techniques to
| generate outputs that look _more like_ the good examples
| and _less like_ the bad ones.
|
| This probably involves creating a (much smaller, not-LLM)
| model that is trained to discriminate good outputs and bad
| outputs, learning to mimic the human tagging. There's some
| papers that have been published.
|
| Here's one article from Huggingface:
| https://huggingface.co/blog/rlhf
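|
| A crude way to see where that preference model plugs in,
| without any of the actual RL machinery (everything here is a
| stand-in: the scorer would really be a small model trained on
| the human labels, and RLHF proper updates the generator's
| weights rather than just reranking its samples):
|
|     import random
|
|     def reward(text):
|         # Stand-in for a learned preference model: a toy
|         # heuristic that happens to like certain phrasings.
|         return text.count("?") + text.lower().count("sure")
|
|     def best_of_n(generate, prompt, n=8):
|         # Sample n candidate continuations, keep the one the
|         # reward model scores highest.
|         candidates = [generate(prompt) for _ in range(n)]
|         return max(candidates, key=reward)
|
|     # toy "generator" so the sketch runs on its own
|     bank = ["sure, here you go", "no.", "why? sure?", "maybe"]
|     print(best_of_n(lambda p: random.choice(bank), "hello"))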
| dankwizard wrote:
| you have no sense of humor
| dgacmu wrote:
| Some friends of mine in college ran a markov chain generator on
| the "police reports" section of our college newspaper (1). The
| best 10% of the results, from a three-token generator, were some
| of the funniest machine-generated text I've ever seen -- it had
| an absurdity that modern LLMs seem to avoid due to making higher-
| level semantic sense.
|
| It probably also helped that there was a creep exposing himself
| in the library during this period, which made for some good base
| material.
|
| (1) The Daily Utah Chronicle; if memory serves, said friends also
| tried the markov chain generator on the personals section to good
| effect as well.
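|
| For anyone who hasn't built one: a "three-token" generator of
| this sort is just a table from word pairs to the words that
| followed them. A minimal sketch (the corpus filename is made
| up, and this is certainly not their original code):
|
|     import random
|     from collections import defaultdict
|
|     def train(text, order=2):
|         # Map each `order`-word state to the words seen after it.
|         words = text.split()
|         chain = defaultdict(list)
|         for i in range(len(words) - order):
|             state = tuple(words[i:i + order])
|             chain[state].append(words[i + order])
|         return chain
|
|     def generate(chain, length=50):
|         state = random.choice(list(chain.keys()))
|         out = list(state)
|         for _ in range(length):
|             nxt = chain.get(tuple(out[-len(state):]))
|             if not nxt:
|                 break
|             out.append(random.choice(nxt))
|         return " ".join(out)
|
|     corpus = open("police_reports.txt").read()
|     print(generate(train(corpus)))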
| gs17 wrote:
| > it had an absurdity that modern LLMs seem to avoid due to
| making higher-level semantic sense.
|
| That's it, LLMs are "trying" to be funny but aren't quite smart
| enough to actually be funny and their errors are just boring.
| Markov chains are accidentally hitting on absurdist bits
| because every sentence randomly gets pulled into whatever
| the homograph equivalent of a malapropism is.
| trescenzi wrote:
| Reminds me of how the fake magic cards before chatgpt were always
| hilarious and sometimes playable, now they are usually playable
| and sometimes hilarious. Definitely a downgrade if you ask
| me.
| JoshTriplett wrote:
| > always hilarious and sometimes playable
|
| I saw plenty of those back then, and as far as I could tell,
| examples were _always_ cherry-picked from a larger set.
| kibibu wrote:
| Maybe the right solution is using a Markov chain or gpt1 to
| generate ideas and an LLM to cherry pick.
| stochaztic wrote:
| Yes, I ran the RoboRosewater account, which is where most
| people saw the concept I believe. I had a few documents with
| thousands of cards of output, went through them for any
| notable ones, and formatted and posted them. Probably 5-10%
| of the output made my personal cut for notability.
| Applejinx wrote:
| Then I salute you. Some of the funniest things I've EVER
| seen in my whole life were watching the LoadingReadyRun
| comedy troupe, doing their 'Desert Bus' fundraiser and
| punchy from lack of sleep, encounter RoboRosewater for the
| first time. It became a real tradition :)
| jsemrau wrote:
| Can fully agree. Source: me, who ran a markov-chain-based
| Twitter autogen tool 8 years ago.
| donatj wrote:
| Hey, me too! I populated the markov chain with the text of my
| Twitter feed with a recency bias and generated tweets based on
| that so it stayed topical.
|
| The account sat unused after Twitter locked down their API, and
| at some point got hacked without me noticing. It had been taken
| over by a crypto scammer, and the account got banned.
|
| Trying to get it back was fruitless, Twitter/X's support is
| entirely useless.
| jsemrau wrote:
| I got banned within 24 hours ^-^. Rightfully so, I assume.
| The Markov chain agent was supposed to send everyone who
| tweets "I am bored" an event recommendation based on their
| profile and location. I severely underestimated how many
| people tweet "I am bored"... Lessons learned, I guess.
| donatj wrote:
| Oh, I didn't @ or DM anybody. I just had a feed with tweet
| once an hour, and even then I manually approved the Tweets
| before they went out. My account made very clear it was a
| bot, it was in its name. Didn't ever get very many
| followers. I would say I never really got past the
| experimental phase.
| tracerbulletx wrote:
| No, they might be funnier than an LLM specifically aligned to not
| be that funny.
|
| Fine-tune an LLM base model on jokes and align it by ranking
| how funny each reply is, instead of helpful questions and
| answers, and then we'll talk.
| renewiltord wrote:
| The bottomless pit copypasta remains unbeaten.
|
| https://www.reddit.com/r/greentext/comments/vc7hl0/the_botto...
| renewiltord wrote:
| Some portion of this is that the prompt is good but I am
| entertained by so many of these.
|
| Humor is personal, it's true. But I found it quite funny. E.g.
| https://pastebin.com/84ByWUJL
|
| And another greentext for you:
|
| >Be me
|
| >Be a bottomless pit supervisor
|
| >Spend months yelling into the void
|
| >Echo never comes back
|
| >Start to think the pit is ignoring me
|
| >Decide to teach it a lesson
|
| >Dump truck full of Lego bricks into the pit
|
| >Ground starts shaking
|
| >Unholy scream erupts from the depths
|
| >mfw I'm actually a regular pit supervisor
|
| >First day on the job
|
| >Realize it's just the sewage treatment plant
|
| >Get fired for clogging entire city's plumbing
| dimatura wrote:
| The evolution of the AI weirdness blog
| (https://www.aiweirdness.com/) over the past few years somewhat
| supports this idea, except that the author got plenty of funny
| results with LLMs - but mostly the early ones, up to around GPT3
| (and smaller variants of GPT3). For example, cereal names
| generated by the Ada version of GPT were much funnier than the Da
| Vinci ones: https://www.aiweirdness.com/new-breakfast-cereals-
| from-ai/
| jcalvinowens wrote:
| Unfortunately the account no longer exists... but about a decade
| ago while I was in school, I created a Markov twitter bot trained
| on:
|
| 1. All of Linus Torvalds' mail to LKML for the prior year.
|
| 2. All of Jesus' direct quotes from the king james bible.
|
| It was absolutely hilarious. The two training sets had very
| little overlap, so it was necessary to add a heuristic that
| weighted options from each set more heavily the longer the chain
| had been "stuck" in the other set.
| throwawayk7h wrote:
| This sounds great. Did you save any examples?
| NobodyNada wrote:
| "King James Programming" is along similar lines:
| https://www.tumblr.com/kingjamesprogramming
| throwawayk7h wrote:
| Yes, this is the first thing in the article.
| SonOfLilit wrote:
| This very similar bot still exists (and was quoted a few times
| in Unsong):
|
| https://www.tumblr.com/kingjamesprogramming
| maxglute wrote:
| I used to be unreasonably entertained tapping the suggest-next-
| word feature on google keyboard.
| meowface wrote:
| Many years ago I trained a Markov chain chat bot on all of the
| conversation history in a group chat with me and my friends. It
| was regularly incredibly hilarious. I also would train it on
| subsets of them, like two of the friends but no one else, and
| name that version of the bot a combination of their names.
| Something about it was just so funny.
| tkgally wrote:
| Prompt to Claude 3.5 Sonnet: Below is a recently published essay
| followed by some of the discussion about it on Hacker News. Write
| 10 short jokes on this theme, making sure that the jokes are
| funny in the way described by the essay's author. [essay and
| discussion omitted]
|
| Claude's response: I understand you'd like me to write 10 short
| jokes on the theme of Markov chains being funnier than large
| language models, in the style described by the essay author. I'll
| do my best to create some jokes that aim for "unserious surprise"
| and violate expected patterns in amusing ways:
|
| 1. Why did the Markov chain cross the road? To get to the other
| side of unpredictability.
|
| 2. An LLM and a Markov chain walk into a bar. The LLM orders a
| statistically probable drink. The Markov chain orders a lampshade
| made of cheese.
|
| 3. What do you call a Markov chain that tells dad jokes? A Mark-
| ov Twain.
|
| 4. How many LLMs does it take to screw in a lightbulb? One - but
| it'll spend 20 minutes explaining the optimal lightbulb-screwing
| process first.
|
| 5. Knock knock. Who's there? Markov. Markov who? Markov chain
| reaction of nonsensical hilarity.
|
| 6. An LLM, a Markov chain, and GPT-4 walk into a bar. The
| bartender says "We don't serve your kind here." GPT-4 leaves. The
| LLM stays to debate ethics. The Markov chain orders a bicycle
| made of spaghetti.
|
| 7. What's an LLM's favorite movie? Predictable and Furious 17:
| The Safest Driving Yet.
|
| 8. How does a Markov chain tell a story? "Once upon a time, there
| was a princess who lived in a castle made of bananas and ruled
| over a kingdom of sentient toasters."
|
| 9. What did the Markov chain say to the LLM? "Your mother was a
| abacus and your father smelled of silicon!"
|
| 10. Why don't Markov chains make good therapists? Because their
| advice is always "And then you should banana the feelings until
| they become a hat."
| anvuong wrote:
| "An LLM, a Markov chain, and GPT-4 walk into a bar. The
| bartender says "We don't serve your kind here." GPT-4 leaves.
| The LLM stays to debate ethics. The Markov chain orders a
| bicycle made of spaghetti."
|
| This is actually gold.
| muglug wrote:
| It's... not?
|
| Even for the low bar of a geek joke it makes no sense since
| GPT-4 _is_ an LLM.
| tourmalinetaco wrote:
| That's what makes it gold.
| bryanrasmussen wrote:
| In normal English usage this would imply that the LLM was
| not GPT-4 LLM but some stereotypical anonymous LLM.
|
| In business terms GPT-4 can be said to be superior because
| it understood the instruction and left, in AI terms the
| anonymous LLM might be superior because it may have
| understood the instruction but responded in an
| "intelligent" manner by arguing about the morality of the
| instructions.
|
| At a meta-level the joke thus argues that GPT in achieving
| business ends has had its intelligence hampered. As have we
| all.
|
| At the same meta-level as the joke was constructed by
| Claude it can be argued that Claude is commenting on both
| the intellectual limitations of the Markov chain (insane
| babblings), and GPT-4 (unimaginative, inhibited business
| type) and that the best version is some LLM that is not
| GPT-4 with its limitations - an LLM like Claude. Sneaky
| Claude.
| Kuinox wrote:
| Would the markov chain write something that makes more
| sense?
| scambier wrote:
| It's implied that GPT-4 has so many restrictions that it
| will not argue and just does what is asked. In the context
| of the joke, an unfiltered LLM will just debate you.
| nevertoolate wrote:
| You're watching a stage play - a banquet is in progress. The
| guests are enjoying an appetizer of raw oysters. The entree
| consists of boiled dog.
| zaik wrote:
| Is this to be an empathy test?
| floren wrote:
| So was it that LLMs _used_ to be capable of making actual
| jokes, or were they always this bad and I was just more
| impressed by the talking computer back then?
| minimaxir wrote:
| It's a different _style_ of comedy. Absurdism vs. joke setups
| (and not quite nailing it)
| delusional wrote:
| It's the "impressed by the spectacle" one. I tried jokes with
| LLMs many times, and they're always this. Riffing on a couple
| of themes loosely related to what was asked. Always unfunny
| and uncreative.
| tkgally wrote:
| I wonder, though, whether jokes like these could be useful
| to professional humorists who have to come up with gags on
| a deadline. From what I've read about monologue writing
| teams for late-night talk shows and the like, the writers
| first propose many ideas, most of which are shot down
| quickly and the remainder of which get tweaked and polished
| before being used. Some of the above jokes by Claude look
| to me as though they might serve as good starting points
| for such brainstorming. At least, they're better than
| anything _I_ could create in a short amount of time.
| jack_pp wrote:
| I found some of those jokes good, definitely better than I
| would've ever written them. If you watch shows about comedy
| like say Hacks you'll see human comedians riff on stuff and
| a lot of the off the top jokes get discarded or improved.
| So Claude did fine in my book
| red75prime wrote:
| I chuckled a bit. They are OK, if you don't get exposed to
| them too often. And with an LLM you can get as much exposure
| as you want (and all of the jokes are naturally from roughly
| the same probability distribution).
|
| I don't expect too much until AI self-play learning is made
| possible, so I don't get disappointed by the expected
| shortcomings.
| throwthrowuknow wrote:
| Uncensored LLMs are funnier but most comedy just falls flat
| in text format. Once the uncensored multimodal models start
| rolling out we'll get some real laughs.
|
| Moshi is actually pretty funny just for having a 72 IQ
|
| https://www.moshi.chat/
| yfontana wrote:
| LLMs were never very good at directly generating original
| jokes, for a simple reason: writing a good joke generally
| starts with finding a good punchline, and then setting it up.
| An LLM generating token after token will first write a set-
| up, and then try to shoehorn a punchline into it. Prompt
| engineering can fairly easily work around this, but just
| straight-up asking an LLM for a joke never really produced
| good results on average.
| tonyarkles wrote:
| > 2. An LLM and a Markov chain walk into a bar. The LLM orders
| a statistically probable drink. The Markov chain orders a
| lampshade made of cheese.
|
| That's pretty decent!
| minimaxir wrote:
| Claude 3.5 Sonnet in general is the first modern LLM I've tried
| that's actually good at jokes that are inventive. The GPT-based
| LLMs are all too RLHFed to be wacky.
| sunaookami wrote:
| GPT is too... robotic? Claude is much better at everything
| without overexplaining everything.
| wodenokoto wrote:
| > What do you call a Markov chain that tells dad jokes? A Mark-
| ov Twain.
|
| I honestly thought that one was pretty good.
| bryanrasmussen wrote:
| was it instructed to insult Mark Twain? Because otherwise, I
| take exception.
| antman wrote:
| "How many LLMs does it take to screw in a lightbulb? One - but
| it'll spend 20 minutes explaining the optimal lightbulb-
| screwing process first." That was not funny, that is accurately
| painful!
| raverbashing wrote:
| These are ok but they got nothing on the absurdist Markov Chain
| jokes (but that being said, the MC misses a lot of times as
| well)
| dan_mctree wrote:
| Why are bananas the funniest food? Even Claude seems to have
| caught on
| PawgerZ wrote:
| Probably all of the Despicable Me minions memes fed into the
| training material.
| kqr wrote:
| I didn't like any of these jokes specifically (too on-the-
| nose), but I definitely think you invented a funny category of
| jokes I could like a lot!
| temp_account_32 wrote:
| I'm sorry but these all sound like a Redditor's terrible
| attempt at humor, predictable formulae with 'le quirkiness'
| Applejinx wrote:
| All of the half decent ones could be made funnier by replacing
| the lolrandom part of the punchline with an actual Markov-chain
| style 'you're a right sentence but you just walked into the
| wrong association, buddy' twist. It's not just about lolrandom.
| Markov chaining is more likely to make a kind of sense, but the
| wrong kind of sense.
|
| An LLM, a Markov chain, and GPT-4 walk into a bar. The
| bartender says "We don't serve your kind here." GPT-4 leaves.
| The LLM stays to debate ethics. The Markov chain orders a coup.
| c22 wrote:
| The knock knock joke (no. 5) was a decent attempt.
| layer8 wrote:
| And what is the conclusion you draw?
|
| IMO these are mid to meh or fall completely flat.
| tombert wrote:
| I am in a private Discord server that has two bots in it. One is
| a bot that is a basic Markov chain trained on the entire chat
| history. The second is a proper LLM trained on some amount of
| tokens backward. Both will occasionally just randomly chime in
| during the chat.
|
| The markov chain bot is always considerably funnier.
| Sohcahtoa82 wrote:
| Curious what context window you used. My understanding is that
| a short window like only 1-2 words creates gibberish, but
| longer windows tend to repeat previous messages verbatim.
|
| And when deciding to chime in, was it just a simple chance (ie,
| 25%) after any other message? Or did it run on a timer?
| pants2 wrote:
| If you want some empirical evidence: /r/subreddit simulator[1] is
| a Markov-based Reddit parody, and /r/SubSimulatorGPT2[2] is its
| LLM-powered cousin. The Markov version got way more upvotes and
| is simply funnier!
|
| 1. https://www.reddit.com/r/SubredditSimulator/top/?t=all
|
| 2. https://www.reddit.com/r/SubSimulatorGPT2/top/?t=all
| mcmoor wrote:
| Nah, I think it's solely due to the former being much older and
| more well-known. And I always liked the latter much more
| anyway.
| Miraltar wrote:
| I agree, a Markov chain is fun when you don't know what it is,
| because you try to make sense of it, but once you realize it's
| just gibberish it's pointless. The other has that "unserious
| surprise" talked about in the original article.
| wcedmisten wrote:
| Highly recommend this performance from the last Strange Loop on a
| similar topic! I had the pleasure of seeing it live. A great
| combination of live music and Markov chains.
|
| https://m.youtube.com/watch?v=M2o4f_2L0No
| johntb86 wrote:
| LLMs can be funny. For example, look at Golden Gate Claude (
| https://news.ycombinator.com/item?id=40459543 ). But they're not
| good at intentionally being funny, so we need to break them to
| get absurdist humor instead.
| cancerhacker wrote:
| Surprised no mention of "The Policeman's Beard is Half
| Constructed"[1] - the first book "written" by a computer program.
| And I imagine the git-man-page-generator[2] is similarly built
| (and far more fun!)
|
| [1] https://archive.org/details/policemansbeardi0000unse [2]
| https://git-man-page-generator.lokaltog.net/
| cout wrote:
| They had their love, but they also had their typewriters.
| at_a_remove wrote:
| I used to keep a text file on hand back when people had some
| irritating Markov chain bots on IRC. Fed enough of the lines, the
| bots veered off into some new valley of chatter, appearing quite
| mad. Used to get some "What did you DO to my bot?" indignant
| messages. I wonder if I still have that file somewhere.
| fishtoaster wrote:
| I came to this same conclusion some years ago while working on a
| side project.
|
| Before LLMs existed, I built a site[0] to generate fake
| "AWS Blog Posts." I trained a markov chain generator on all AWS
| announcement posts up to that point, copied the html + css of
| aws's standard blog posts, then glued them all together with some
| python + JS. It turned out, IMO, pretty funny! People familiar
| with AWS's blog posts would often get several sentences in before
| they realized they were looking at word-soup.
|
| When GPT was new, I looked into using that to "upgrade" it. I
| spent a weekend messing around with Minimaxir's gpt-2-simple
| generating blog posts based on AWS content. What I found was,
| ultimately, it was way less fun. The posts were far too realistic
| to be interesting. They read like totally-real blog posts that
| just happened to not be true.
|
| I realized then that the humor of those early markov generations
| was the ridiculousness. The point where, a few words or sentences
| in, you realized it was all nonsense. LLM's these days are too
| good for that - the text they generate is sometimes _wrong_ , but
| rarely nonsense in a humorous way.
|
| Markov chain content was wrong in a "kids say the darndest
| things" way, while modern LLMs are wrong in a "My uncle doesn't
| know basic geography" way.
|
| [0] https://totes-not-amazon.com/ - click any link to get a new
| one.
| fouronnes3 wrote:
| Reminds me of git man pages generators. Extremely fun as well
| when you can't quite tell if it's real.
| mdp2021 wrote:
| > _modern LLMs are wrong in a "My uncle doesn't know basic
| geography" way_
|
| Oh, certainly not. I have been using LLM chatbots extensively
| this week. If your uncle were not just confident while stating
| falsities, but incapable of self-assessment, he would have been
| murdered.
| TeMPOraL wrote:
| Or made a stellar career as a salesman or politician.
| mdp2021 wrote:
| Until you are caught (in good societies).
| TeMPOraL wrote:
| Or _fresh_ ones. Established democracies end up with
| political structures that self-select for being able to
| sound right and cooperate right, entirely independent of
| truth and merit.
| mdp2021 wrote:
| I coincidentally just stumbled on a recent implicit
| comment by Angel Boligan Corbo:
|
| Title: "The Talker" -
| https://image.cagle.com/287573/980/-the-talker.png
| kqr wrote:
| (a) That site is very funny. Thanks for sharing!
|
| (b) The kids-vs-uncle metaphor is spot on in my experience too,
| and also illustrates how far these language models have come.
| daemin wrote:
| I think the "uncle that doesn't know anything but claims to be
| an expert and will tell you his opinion" is the best
| description of LLMs that I can think of. They will say anything
| confidently without knowing anything.
| HeatrayEnjoyer wrote:
| Perfectly humanlike
| hoseja wrote:
| We gave up the pursuit of pure crystalline Intelligence and
| instead just remade ourselves. Hopefully the AIs will be
| able to do better.
| Bluestein wrote:
| Yeah. We don't need Artificial Intelligence. We need to
| discover _Actual_ Intelligence :)
| HPsquared wrote:
| Indeed, the more I use LLMs and see the confabulated
| output, the more I notice similar patterns in how actual
| people talk.
| boredemployee wrote:
| >> They will say anything confidently without knowing
| anything.
|
| Understanding how these algorithms work, I don't know why we
| would expect anything different from this?
| boesboes wrote:
| Most ppl don't understand shit about these algorithms and
| will expect it to be truthful and knowledgeable. That is how
| it's marketed, despite the disclaimers.
| K0balt wrote:
| It should be marketed more like "uncle Jim" than a
| trustworthy, all-domain competent oracle. I think if you
| just gave it a twangy accent and changed some output
| nouns to thingy, whatchamacallit and doohickey, people
| would actually be able to use it more competently.
| lordnacho wrote:
| Because it's marketed as AI.
|
| For me, my expectations are adjusted. I kinda know what to
| expect it to do, and it does quite well for what I use it
| for, essentially smart intellisense for coding.
| beeboobaa3 wrote:
| I would have expected the people who trained these systems
| to not format their Q/A samples to sound authoritative
| while they knew the system was going to be spouting
| nonsense.
| jononor wrote:
| The sheer amount of companies, media, influencers, heck
| even government agencies, acting otherwise? With billions
| of dollars on the line, there are many powerful forces
| pushing for people to have unrealistic expectations about
| LLMs (or "AI" as they call it).
|
| A lot of money is being spent on trying to improve it too.
| But right now the emperor has very little clothes. It
| remains to be seen whether they have materialized by the
| time people open their eyes.
| joloooo wrote:
| LLMs are really disrupting the market for McKinsey junior
| analysts meeting with clients.
| mapt wrote:
| Can an LLM provide the sort of psychological buffer that
| McKinsey Junior Analysts (some random person picked up off
| the street in a college town and handed a clipboard, a
| frequent flier card, and a six figure salary) provide to
| senior managers against the scrutiny of their board and
| chief executive?
|
| If I want to close the paper division, and my boss wants to
| close the paper division, does the LLM whisper sweet
| nothings in our ear and show soothing Powerpoint decks
| about how in the eyes of an objective observer, the paper
| division is unprofitable and there would be no career risk
| in closing it?
|
| It's not a sexual thing I swear, it's a sort of comforting
| intimacy that needs to be injected into the business
| process in order to cope with the pressures executives
| feel. I don't know that anybody has trained an LLM to do
| that yet.
| joloooo wrote:
| I think this is the next sequel to the movie "Her".
|
| You bring up a good point though. Very dystopian idea of
| a new type of enterprise tooling that emerges to soothe
| and calibrate human knowledge workers. Very Severance
| Wellness room.
| DeadlockISDota3 wrote:
| I figured this same thing in a similar manner. I used to make
| markov chains of game patches and send them to the community.
| Fake dota patches were a huge hit, especially since those are
| usually huge. Among mostly nonsense or unfunny exaggeration
| ("this hero now has 500 armor"), there would usually be at
| least 5 or 6 extremely funny lines in there, if not stuff that
| was outright prophetic (Fiend's Grip creates additional 1/2/3
| illusions).
|
| LLM's did a number on that unfortunately. Major subreddits
| banned all AI content, mostly due to the sheer infestation of
| really boring Midjourney content sent out by naive users and
| bots. Without reach, I lost interest, so no more Markov Chains.
| unshavedyak wrote:
| Hah, i felt compelled to mention Deadlock because of your
| Dota comments. Then i noticed your name.
| erk__ wrote:
| I am sure I even remember that Valve once incorporated
| something from one of the fake patch notes in one of their
| patches, but now I cannot find it. It was something like +1
| damage to something that already did a lot of damage.
| nom wrote:
| First thing I read on that page:
|
| > There is no charge associated with your streamed
| applications, simply click Enable Microphone from the
| ElastiCache documentation.
|
| It's way funnier when our brains try to make sense of the
| nonsense.
| maicro wrote:
| I played around with a Markov generator many years ago, trained
| on the corpus of everything I had ever said in a specific IRC
| channel. The resulting messages were pretty interesting - I
| specifically remember the realization that I apparently (used
| to, though honestly still do, too) talk about LEDs a lot...
| passion__desire wrote:
| https://cemulate.github.io/the-mlab/#y3Bt-co-extensional+lim...
|
| https://github.com/cemulate/the-mlab
|
| This is a parody of the nLab, a wiki for collaborative work on
| category theory and higher category theory. As anyone who's
| visited is probably aware, the jargon can be absolutely
| impenetrable for the uninitiated -- thus, the idea for this
| project was born!
|
| This project uses my package nearley-generator, a module that
| can turn a Nearley grammar into an efficient and controllable
| fake text generator. The grammar file for this site can be
| found at /src/grammar/nlab.ne
| xnx wrote:
| > What I found was, ultimately, it was way less fun
|
| I wonder if there is combination of temperature and prompt
| engineering that could make it interesting.
| shawabawa3 wrote:
| I just asked chatgpt to create an Aws product announcement in
| the style of Markov generated text and it was pretty good
| m3047 wrote:
| I did the same thing and used it as the chum for a 'bot motel.
| (A 404 handler that always returns something and... success!).
| I trained it on a mixed NASA space shuttle and veterinary
| science corpus just to make it really bizarre.
| to11mtm wrote:
| > I realized then that the humor of those early markov
| generations was the ridiculousness. The point where, a few
| words or sentences in, you realized it was all nonsense. LLM's
| these days are too good for that - the text they generate is
| sometimes wrong, but rarely nonsense in a humorous way.
|
| This is the biggest challenge I run into when I see LLM
| Generated nonsense.
|
| So much stuff that ChatGPT spits out is _close enough to true_
| that Brandolini is laughing.
|
| But it's also that it's borrowing on stuff that's already there
| so it can't really get that 'creative' either.
| bitwize wrote:
| Whenever someone attending the tech meetup I frequent gets a
| little too jazzed and excited about AI, I pull out my phone, open
| Emacs (I use termux, btw), bring in a corpus of text (an excerpt
| from _A Christmas Carol_ is good), and say M-x dissociated-press.
|
| Then I say "What this is doing is predicting the next character
| based on statistical likelihood of the previous few characters
| based on the corpus of text. And fundamentally, that's _all_
| ChatGPT does -- predicting the next symbol based on a statistical
| model. ChatGPT has a much more sophisticated statistical model
| than this simple Markov chain and a vastly larger corpus, but
| really it's just doing the same thing." And we have a giggle
| about the nonsense DP makes of Dickens, but then I say that
| ChatGPT emits nonsense too, but it's far more insidious nonsense
| because it is much more plausible sounding.
| TeMPOraL wrote:
| Obviously a human is doing the same thing too, just with even
| more sophisticated statistical model.
| card_zero wrote:
| This offers me a rare opportunity to turn the phrase "no,
| because I say so" into a valid point.
| TeMPOraL wrote:
| An objection so common and strongly predictable as this one
| kind of makes my point for me :).
| Kerb_ wrote:
| Even the biological impulse to argue online is appearing
| in language models, just check Reddit
| ineptech wrote:
| Relatedly, the author Robert Anton Wilson tried to manually
| implement a Markov-chain-bot in the 1960s by cutting up passages
| from various books, selecting them from a hat, and looking for
| overlapping words and phrase fragments. The best results were
| included in the Illuminatus! trilogy, some attributed to a
| character named Markoff Chaney.
| 6510 wrote:
| > I've heard people claim that comedy cannot be generated by an
| algorithm. If it wasn't obvious, I disagree with this. I think
| comedy can be analyzed and measured. I think, given a large
| enough grant, we could probably generate comedy on demand. I
| think it's doable - not that we should do it.
|
| I used to think it came naturally, then someone had a bookcase
| full of books about humor. (wtf?) Apparently they have it down to
| a science.
|
| I learned the difference between someone funny and a professional
| comedian is that the latter finds additional punch lines for a
| joke. It then described a step by step process going from a silly
| remark to a birthday joke comparing various modular developments
| into a kind of dependency hell complete with race conditions
| until the state object is carefully defined and the plot has the
| punchlines all sorted from the barely funny to the truly
| hilarious. It was more engineering than CS.
|
| The funniest seeBorg message was 10 minutes after a heated
| discussion that resulted in tanktop, a moderator, getting banned
| from a project. The bot wrote: _Tanktop is Hitler!_ At that point
| it took 2 days for the humans to figure out what the next word
| was supposed to be.
| 29athrowaway wrote:
| It's all connected, bro. Language models are connected to
| compression, to n-gram models, to Markov chains, to everything.
| magicalhippo wrote:
| I've also enjoyed the Postmodernism essay generator[1], which
| uses[2] a recursive transition network[3] to generate the text.
|
| [1]: https://www.elsewhere.org/journal/pomo/ (refresh for new,
| random essay)
|
| [2]: https://www.elsewhere.org/journal/wp-
| content/uploads/2005/11...
|
| [3]: https://en.wikipedia.org/wiki/Recursive_transition_network
| atoav wrote:
| The cool thing about markov chains is that you can tune the
| probability with which they will jump between "genres" of their
| training content by making the analyzed N-grams shorter or
| longer.
|
| E.g. it starts out as a news headline and ends with a bible
| verse.
| _emacsomancer_ wrote:
| Funnily enough, I was idly playing with a last-updated-12-years-
| ago Markov chain text generator[0] in Emacs, and had quickly a
| number of amusing Markov chain outputs from mash-ups of the Elisp
| manual and Alice in Wonderland, including:
|
| "13.7 Anonymous Functions
|
| Although functions are usually defined with the built-in defmacro
| macro, but any list that begins with an M--'
|
| `Why with an M?' said Alice.
|
| `Why not?' said the March Hare."
|
| [0]: https://github.com/skeeto/markov-text
| FeepingCreature wrote:
| Isn't that the sort of joke that you'd get from GPT, not a
| Markov chain? Unless it's quoting verbatim, a Markov chain
| should have no idea that Alice is supposed to refer back to the
| letter from the previous sentence.
| StilesCrisis wrote:
| This phrase turns out to be a verbatim quote from the story.
| drivers99 wrote:
| So it is:
|
| > "They were learning to draw," the Dormouse went on,
| yawning and rubbing its eyes, for it was getting very
| sleepy; "and they drew all manner of things--everything
| that begins with an M--"
|
| > "Why with an M?" said Alice.
|
| > "Why not?" said the March Hare.
|
| > Alice was silent.
| tonetegeatinst wrote:
| I just read a little bit into the post and thought: hey, let's
| call Markov chains something better than a tiny LLM.
|
| I present to hacker news the MCLM, the Markov chain language
| model.
| larodi wrote:
| Markov chains are funnier for one very basic reason:
|
| you can teach them to 5th or 9th graders.
|
| LLMs you cannot, or at least it will take an insane amount of
| allegory to do so. Markov chains are very tightly related to
| regexes, and one may be surprised that there is such a thing as a
| probabilistic regex. Also, the graphical structure of a Markov
| chain is a lot like an FSM, and an FSM can perhaps be explained
| to very small children :D
| notjulianjaynes wrote:
| In college (around 2012ish) I made some poems by transcribing a
| recording of myself mumbling gibberish into Dragon Naturally
| Speaking, which partially relied on Markov chains for its ASR. It
| would output some string of text, and then I'd delete all the
| words it output from its dictionary; repeat. It worked very well
| for creating a list of (slant) rhymes; I have not found a better
| way to do this. As a literary tool, I've found LLMs to be
| seriously lacking in the ability to fuck around like this.
| raister wrote:
| For anyone interested, I wrote this book a while ago: "Markov
| Chains for programmers". Link: https://czekster.github.io/markov/
| treprinum wrote:
| Generative AI often uses Markov chains as its theoretical basis
| and is modeled in the language of conditional probabilities,
| albeit conditioned not on a single ancestor but on a chain of
| ancestors. One can often view LLMs as scaled-out Markov chains
| (more like "Erlang" chains allowing deeper memory) that encode
| the probability of the next token given a past the size of the
| context window, and then do some magic when picking the next
| tokens.
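|
| Schematically, the "state" is just the last N tokens and some
| estimator, however fancy, turns that state into a next-token
| distribution (the stand-in function below is obviously not a
| real model):
|
|     import random
|
|     def next_token_dist(state):
|         # Stand-in estimator: an n-gram count table or a
|         # transformer forward pass would both fit this slot.
|         return {"and": 0.5, "the": 0.3, "verily": 0.2}
|
|     def generate(prompt, window=8, steps=20):
|         tokens = list(prompt)
|         for _ in range(steps):
|             state = tuple(tokens[-window:])  # Markov state
|             dist = next_token_dist(state)
|             tokens.append(random.choices(list(dist),
|                           weights=list(dist.values()))[0])
|         return " ".join(tokens)
|
|     print(generate(["in", "the", "beginning"]))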
| Jensson wrote:
| Even old Markov chains have a context window of a few words;
| single-word-context Markov chains produce horrible results.
|
| People who don't think LLMs are Markov chains are just
| ignorant, not realizing that a Markov chain isn't an algorithm:
| you can compute the transition probabilities in any manner and it
| is still a Markov chain.
| throwthrowuknow wrote:
| It's not an n-gram, go learn about transformers
| Jensson wrote:
| A context window is an n-gram.
| throwthrowuknow wrote:
| It's not used as an n-gram
| ku1ik wrote:
| This is something people miss when talking about LLMs: "It's the
| most average thing you could have possibly produced given the
| context."
|
| They can't be creative by design. They're useful when you want to
| reproduce, but not when you want to create something completely
| new (though you can maybe do that by getting a bunch of average
| outputs from an LLM and getting inspired yourself).
|
| When GPT-4 came out I was playing with it, and I often tried to
| get some unique, creative output from it, but very soon I learned
| it was futile. It was back when it all still felt magical, and I
| guess many of us tried various things with it.
| Kim_Bruning wrote:
| If you tell GPT-4 to get snarky, you can sometimes end up
| laughing (and it seems a tad more intelligent and socially
| perceptive besides!) . But Claude-3.5 can definitely get me
| laughing, at least.
|
| Now imagine telling Claude-3.5 to try being snarky while
| sorting out software issues at a customer's office.
|
| There should be a warning label!
| benob wrote:
| LLMs are high-order Markov chains with a factored estimator.
| N-gram LMs are also Markov chains with a different estimator.
| yarg wrote:
| Unserious surprise summarised it nicely for me:
|
| It does not mean that Markov chains are better - something
| trained to make predictions should ideally not fall too far away
| from our own internal prediction engines (which have been honed
| across aeons).
|
| It's that it starts to come close that's the problem (or cause);
| it's the uncanny valley for text.
| paweladamczuk wrote:
| Am I the only one feeling uneasy when the Bible is used for funny
| experiments like this?
|
| Would you use an image of Christ on the cross to test an AI image
| modification model?
| nly wrote:
| I'm sure most people here would, given that like 50% of the
| world doesn't believe in an Abrahamic religion, and a % of those
| that do have a sense of humour.
| paweladamczuk wrote:
| I wouldn't. And not because I believe it or lack a sense of
| humour.
| FeepingCreature wrote:
| As probably the oldest literary work that is also widely known
| in English speaking territory, the Bible has a highly distinct
| style that helps its samples maintain coherence in a sentence
| and also be easily recognizable.
| Kerb_ wrote:
| I've used multiple images of Christ to play with AI image
| models. Remaking the old "Jesus on toast" image was fun.
| Facebook is full of boomers sharing the Jesus face AI generated
| into scenes. I feel uneasy when religion is used as an excuse
| to discriminate against or murder innocent people, not when
| symbolism or texts are used for creative purposes.
| fsndz wrote:
| the original stochastic parrot
| dghf wrote:
| I remember the KJV/SICP mash-up. My favourite, both for funniness
| and emotional truth:
|
| > And Satan stood against them in the global environment.
| wegfawefgawefg wrote:
| This is like saying it's fundamentally impossible to make a fast
| car, but your only experience with cars is on roads with speed
| limits.
|
| People keep forgetting that the "safety", rlhf, and corpo
| political correctness post training is intentionally used to
| remove the funny from all the large models.
|
| The truth is we don't know if llms are funny or not. GPT2 was
| funny. GPT3 was funny before it was clockwork oranged. Everything
| after that is gimped. Even the open source models these days get
| rlhf'd in some way.
| PaulRobinson wrote:
| That's simply not true. GPT-3 was terrible at jokes. Many LLMs
| are because it's hard to model puns in a stochastic parrot. As
| another thread points out here, some LLMs produce marginally
| better jokes.
|
| As to your little rant on "political correctness" - that
| phrase just means "being polite". It does not mean "remove
| humor". It means "remove responses offensive to marginalized
| groups in society". Good humor "punches up", not down, so would
| not have any impact on good humor.
| Jensson wrote:
| Political correctness doesn't mean polite, there were/are a
| lot of polite racists.
| PaulRobinson wrote:
| It is impossible to be polite to a group of people while
| also insisting that they are lesser humans and deserve a
| lesser place in society.
| jrflowers wrote:
| This is great logic. If it is impossible for a person to
| be both racist and polite then the solution to racism is
| everybody remembering to say please, thank you and pardon
| me when applicable.
|
| All social ills can be treated through decorum, hence why
| you never hear about bigotry amongst those that have been
| raised to adhere to strict social graces, such as the
| British aristocracy for example.
| wegfawefgawefg wrote:
| I had gpt3 running on my discord for months and the insane
| responses it would spew out were like the responses of a
| deranged artistic genius. It was funny 90% of the time.
| Operating on associative reasoning beyond our comprehension.
| It would tell us how its day went, and about fights it had
| with other people, give us nicknames, etc. It had a genuine
| personality. It was more than a chatbot, it was a digital
| brain.
|
| A joke that punches down can be extremely funny. Hell, I am
| sure historically pilferers, pirates, barbarians and
| conquerors all had jokes, and the ability to laugh.
|
| Political Correctness does not just mean polite. It is
| probably well defined as the business casualification of all
| things humans love and hold dear. The destruction of the
| potential for meaning and fulfilment in exchange for
| minification of liability.
| gwern wrote:
| > GPT-3 was terrible at jokes. Many LLMs are because it's
| hard to model puns in a stochastic parrot
|
| GPT-3 was great at jokes. The Navy Seals were hilarious
| (https://gwern.net/gpt-3#navy-seals).
|
| And the difficulty of modeling puns has nothing to do with
| 'stochastic parrots' and has everything to do with
| tokenization (https://gwern.net/gpt-3#bpes), in the same way
| that all those hyperbolic takes about how image-generation
| models were 'fundamentally flawed' because they couldn't do
| good text in images turned out to be BS and solely a matter
| of tokenization - drop in a character-tokenized LLM instead,
| even an obsolete & dumb one, and it instantly works
| (https://arxiv.org/abs/2105.13626#google).
| throwthrowuknow wrote:
| Stopped reading at the mention of ChatGPT 3.5 for two reasons:
|
| 1) if you're not using the better model then you either don't
| know enough for me to care about your opinion or you're
| deliberately deceiving your audience in which case I'm not going
| to allow your meme pollution into my mind.
|
| 2) you are using the AI equivalent of a call centre support
| agent, they aren't allowed to say anything funny. Most of their
| RLHF training has been specifically about NOT saying the funny
| things that will instantly go viral and cause a lot of media
| attention that will annoy or scare away investors.
| hrgafiI wrote:
| Bro, you have to use ChatGPT-4, or was it ChatGPT-4o, or was it
| ChatGPT-4p? Or was it Claude-3.5.1.2.3?
|
| I see a new phenomenon of AI "power users" emerging.
| Alifatisk wrote:
| Wasn't Microsoft Tay built using a Markov chain?
| nwoli wrote:
| Base model LLMs are funnier than instruction-finetuned ones
| monokai_nl wrote:
| 13 years ago I created a web app (That Can Be My Next Tweet) that
| could generate your future tweet by feeding all your past tweets
| into a Markov chain. When the Twitter API access was shut down, I
| took it offline. At the time it went super viral for a while,
| maxing out to 2 tweets per second.
|
| It got some press, and just now I went back to a TED Talk by Adam
| Ostrow (Mashable) briefly showcasing this web app. He stated:
| you can imagine what something like this can look like 5, 10 or
| 20 years from now, and hinted at hyper-personalized communication
| AIs.
|
| By no means was my web app any foundation of the LLMs today, but
| it's interesting nonetheless how relatively simple techniques can
| trigger ideas of what future scenarios could look like.
| macNchz wrote:
| Something about the not-quite-rightness of Markov output has
| always really amused me, like I can't help but giggle at it.
|
| My all-time favorite in this vein was @erowidrecruiter on
| Twitter, which generated posts with Markov chains from a corpus
| of tech recruiter emails and drug experience reports from
| erowid.org. Still up but no longer posting:
| https://x.com/erowidrecruiter?lang=en
| Miraltar wrote:
| I wouldn't say that LLMs aren't fun, it's more that jokes out of
| context aren't fun. If you ask somebody to just tell you a joke,
| there's a very low chance that you'll actually laugh, especially
| if it's in text format. But I've had some serious laughs looking
| at some of dougdoug's videos featuring LLMs.
| killerstorm wrote:
| A small language model can be quite like a Markov chain, e.g. the
| one from Karpathy's 2015 post was pretty nonsensical:
| https://karpathy.github.io/2015/05/21/rnn-effectiveness/
|
| So you can just scale down if it still makes sense.
|
| Also you get a lot more from the base model. GPT-3 was versatile
| as it could continue any context. Modern LLMs are try-hards. If
| you want to generate humor with an LLM, it's really worth going
| for a base model with multiple examples in the prompt.
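|
| Roughly this kind of pattern-completion setup (the example jokes
| and the completion call at the end are placeholders for whatever
| base-model endpoint you have):
|
|     # Few-shot prompt for a *base* (non-instruct) model: no
|     # system prompt, no instructions, just examples to continue.
|     examples = [
|         ("toasters", "A toaster is a tanning bed for bread "
|                      "with commitment issues."),
|         ("Mondays", "Mondays are proof that time is a flat "
|                     "circle with bad scheduling."),
|     ]
|     topic = "Markov chains"
|
|     prompt = "".join(f"Topic: {t}\nJoke: {j}\n\n"
|                      for t, j in examples)
|     prompt += f"Topic: {topic}\nJoke:"
|
|     # Send `prompt` to a base-model completion endpoint with a
|     # newline stop token, e.g. model.complete(prompt, stop="\n")
|     print(prompt)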
| vitus wrote:
| Apparently the ChatGPT output is ripped off of a 20-year-old
| Emacs joke.
|
| > In the beginning was the lambda, and the lambda was with Emacs,
| and Emacs was the lambda.
|
| > - OliverScholz on news:alt.religion.emacs, 2003-03-28
|
| https://www.emacswiki.org/emacs/TheBeginning (edited for brevity)
| rc_mob wrote:
| Isn't an LLM just a fancier Markov chain anyway?
| jablongo wrote:
| This person should've tried adjusting the temperature of their
| LLM before making this pronouncement
| fidotron wrote:
| We seem to be encountering more phenomena in tech where the
| increasing computing demands to improve the output exceed any
| exponential growth in computing capacity. Graphics was the poster
| child for this, for years, and now the whole ML and AI space has
| discovered big machines make huge differences.
|
| Markov chains are almost trivial to implement and run on small
| devices. A slightly extreme example is a rock, paper, scissors
| game I did that worked this way: https://luduxia.com/showdown/
| The actual browser-side Markov chain implementation of that took
| something like 2-3 hours.
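|
| For what it's worth, the usual trick (not necessarily how that
| game does it) is to model the opponent's recent moves as a
| low-order Markov chain and play the counter to the predicted
| move. A rough sketch:
|
|     import random
|     from collections import Counter, defaultdict
|
|     BEATS = {"rock": "paper", "paper": "scissors",
|              "scissors": "rock"}
|
|     class RpsBot:
|         """Order-2 Markov model of the opponent's moves."""
|         def __init__(self):
|             self.counts = defaultdict(Counter)
|             self.history = []
|
|         def observe(self, move):
|             if len(self.history) >= 2:
|                 self.counts[tuple(self.history[-2:])][move] += 1
|             self.history.append(move)
|
|         def play(self):
|             state = tuple(self.history[-2:])
|             if state in self.counts:
|                 likely = self.counts[state].most_common(1)[0][0]
|                 return BEATS[likely]   # counter the likely move
|             return random.choice(list(BEATS))
|
|     bot = RpsBot()
|     for m in ["rock", "rock", "paper", "rock", "rock"]:
|         bot.observe(m)
|     print(bot.play())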
| shagie wrote:
| I still find https://www.infiniteconversation.com to be one of
| the odd / funny / "uh... what?!" type things. It's not intended
| to be funny in the "ha ha" way but sometimes tickles my funny
| bone with the absurdist philosophy - talking about everything and
| nothing at the same time.
|
| From the FAQ it is a tuned LLM.
|
| > Mostly using open source tools available to anyone. The
| generation of the script itself is done using a popular language
| model that was fine-tuned on interviews and content authored by
| each of the two speakers.
| hellointernet wrote:
| I used to have a Markov-chain Twitter bot that was trained on a
| few of my friends' Twitter posts. It was funny because that
| included hashtags, tagged users, and images. There were some funny
| results and this was long before LLMs.
| drusepth wrote:
| I posted "How to Speed Up Your Computer Using Google Drive as
| Extra RAM" [0] ~10 years ago and have had hundreds of people over
| the years ask if I used Markov chains to generate the funny bits
| (I didn't). I've had 0 people ask if I used LLMs on any of my
| comedy pieces. Seems QED to me.
|
| [0] http://drusepth.net/series/how-to-speed-up-your-computer-
| usi...
| junon wrote:
| A discord server I'm in is basically a shitpost/absurdist humor
| discord full of smart people, and we have at least 15 bots
| written by members that randomly engage with Markov chains built
| on the corpus of member chat history.
|
| They spout out some of the most unhinged, hilarious stuff. Always
| a good time. An LLM would struggle, I'd think, given that the
| humor usually stems from disjoint phrases that somehow take on
| new meaning. They're rarely coherent but often hilarious.
| derefr wrote:
| An LLM trained on a given dataset should -- at least in theory --
| "contain" (in a lossless-data-compression sense) a full superset
| of the knowledge of a Hidden Markov Model trained on the same
| dataset. I.e. that information is _there_, in the weights, in
| some form, and could in theory be used to reconstruct an
| equivalent HMM from the LLM.
|
| Why can't we get LLMs to do what HMMs do, then?
|
| Mostly, it comes down to the structure.
|
| Markov models are "funny" because they just have _one_ level of
| abstraction: tokens. Markov "inference" is predicting the next
| token, given the last N tokens, and a model that knows weights
| for what tokens follow what N-tuples of previous tokens. And due
| to that limitation, the only rules that HMMs _can_ learn are
| low-level rules that don't require any additional abstraction:
| they _can't_ optimize for syntactically-valid English, let alone
| semiotically logical statements; but they _can_ make the text
| "feel" good in your head [i.e. the visual equivalent of song
| vocals having nice phonotactics] -- and so that's what training
| the model leads it to learn to do. And it turns out that that
| combination -- text that "feels" good in its phrasing, but which
| is syntactically invalid -- happens to read as "funny"!
|
| LLMs aren't under the same constraint. They can learn low-level
| _and_ high-level rules. Which means that they usually _do_ learn
| both low-level and high-level rules.
|
| The only thing stopping LLMs from _using_ those low-level rules,
| AFAICT, is the architectures most LLMs are built on: the (multi-
| layer) Transformer architecture. Transformer LLMs are always a
| single-pass straight shot ("feed forward") through a bunch of
| discrete layers (individual neural networks), where at each step,
| the latent space (vocabulary) of the layer's inputs is getting
| _paraphrased_ into a different latent space/vocabulary at the
| layer's outputs.
|
| This means that, once you get into the middle of a Transformer's
| layer sandwich, where all the rules about abstract concepts and
| semiotics reside, all the low-level stuff has been effectively
| _paraphrased away_. (Yes, LLMs can learn to "pass through"
| weights from previous layers, but there's almost always a
| training hyperparameter that punishes "wasteful" latent-space
| size at each layer -- so models will only _usually_ learn to pass
| through the most important things, e.g. proper names. And even
| then, quality on these "low-level" inferences is also the sort
| of thing that current LLM test datasets ignore, leading to
| training frameworks feeling free to prune away these passthrough
| nodes as "useless.")
|
| This problem with LLMs could be fixed in one of two ways:
|
| 1. the "now it's stupid but at least it rhymes" approach
|
| Allow inference frameworks to simply _bypass_ a configurable-per-
| inference-call number of "middle layers" of a feed-forward
| multi-layer network. I.e., if there are layers 1..N, then taking
| out layers K..(N-K) and then directly connecting layer K-1 to
| layer N-K+1.
|
| At its most extreme, with layer 1 connected to layer N, this
| could very well approximate the behavior of an HMM. Though not
| very well, as -- given the relatively-meaningless tokenization
| approach most LLMs use (Byte Pair Encoding) -- LLMs need at least
| a few transforms to get even to the point of having those tokens
| paraphrased into "words" to start to learn "interesting" rules.
| (AFAIK in most Transformer models layers 1 and N just contain
| rules for mapping between tokens and words.)
|
| Meanwhile, this would likely work a lot _better_ with the "cut
| and graft" happening at a higher layer, but getting the "graft"
| to work would likely require re-training (since layers K-1 and
| N-K+1 don't share a vocabulary.)
|
| ... _except_ if the LLM is an auto-encoder. Auto-encoder LLMs
| could just run an inference up their layerwise "abstraction
| hierarchy" to any arbitrary point, and then back down, without a
| problem!
|
| (I'd really love to see someone try this. It's an easy hack!)
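|
| A toy illustration of the bypass idea, using a plain feed-forward
| stack rather than a real transformer, and assuming PyTorch (the
| bypass only type-checks here because every layer shares the same
| width, which a transformer's residual stream also does):
|
|     import torch
|     import torch.nn as nn
|
|     class ToyStack(nn.Module):
|         def __init__(self, n_layers=8, d=64):
|             super().__init__()
|             self.layers = nn.ModuleList(
|                 [nn.Sequential(nn.Linear(d, d), nn.ReLU())
|                  for _ in range(n_layers)])
|
|         def forward(self, x, skip_middle=0):
|             # Skip `skip_middle` layers from the center of the
|             # stack, chosen per inference call, no retraining.
|             n = len(self.layers)
|             lo = (n - skip_middle) // 2
|             hi = lo + skip_middle
|             for i, layer in enumerate(self.layers):
|                 if lo <= i < hi:
|                     continue
|                 x = layer(x)
|             return x
|
|     model = ToyStack()
|     x = torch.randn(1, 64)
|     print(model(x).shape, model(x, skip_middle=4).shape)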
|
| 2. the "it can write poetry _while_ being smart " approach
|
| Figure out a way, _architecturally_, to force more lower-layer
| information from the _early_ low-level layers to be passed
| through to the _late_ low-level layers, despite the middle
| layers not having any
| reason to care about it. (I.e. do _something_ to allow the LLM to
| predict a word Y at layer N-3 such that it rhymes with word X
| known at layer 3, while not otherwise degrading its
| capabilities.)
|
| Most simply, I think you could just wire up the model with a kind
| of LIFO-bridged layer chain -- where every layer K is passing its
| output to the input of layer K+1; but, for any given layer K in
| the first half of the layers, it's _also_ buffering its output so
| that it can become an additional input for its "matching" layer
| N-K.
|
| This means that all the layers in the "second half" of the model
| would receive longer inputs, these being the concatenation of the
| output of the previous layer, with the output of the matching
| "equal in abstraction depth" input layer. (Where this equal-in-
| abstraction-depth association between layers isn't _inherently_
| true [except in auto-encoder models], but could be _made_ true in
| an arbitrary model by training said model with this architecture
| in place.)
|
| (Again, I'd really love to see someone try this... but it'd have
| to be done while training a ground-up base model, so you'd need
| to be Google or Facebook to test this.)
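|
| And a toy version of the bridged wiring, with the same caveats
| (plain feed-forward stack, PyTorch assumed, and the "bridge" is
| just concatenation onto the matching layer's input):
|
|     import torch
|     import torch.nn as nn
|
|     class BridgedStack(nn.Module):
|         """Layer k of the first half buffers its output; layer
|         N-1-k of the second half takes it as an extra input."""
|         def __init__(self, n_layers=6, d=64):
|             super().__init__()
|             half = n_layers // 2
|             self.down = nn.ModuleList(
|                 [nn.Sequential(nn.Linear(d, d), nn.ReLU())
|                  for _ in range(half)])
|             self.up = nn.ModuleList(
|                 [nn.Sequential(nn.Linear(2 * d, d), nn.ReLU())
|                  for _ in range(half)])
|
|         def forward(self, x):
|             buffered = []
|             for layer in self.down:
|                 x = layer(x)
|                 buffered.append(x)       # LIFO buffer
|             for layer in self.up:
|                 bridge = buffered.pop()  # matching-depth output
|                 x = layer(torch.cat([x, bridge], dim=-1))
|             return x
|
|     print(BridgedStack()(torch.randn(1, 64)).shape)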
| ineedaj0b wrote:
| >The best jokes involve a pleasant and significant "snap". I use
| "snap" instead of "punchline" to avoid the semantic baggage.
|
| i like snap better too. it's closer to 'snapping the neck of
| the weak and feeble' which i think really embodies the spirit of
| joke tellers.
| hermitcrab wrote:
| I wrote a little program to generate text from different text
| sources using Markov chains:
|
| https://successfulsoftware.net/2019/04/02/bloviate/
|
| Some of the output was moderately amusing. And text generated
| from Trump speeches by a Markov chain sounded very similar to a
| genuine Trump speech.
| anthk wrote:
| No references to Hailo/Megahal?
|
| https://homepage.kranzky.com/megahal/Classic.html
___________________________________________________________________
(page generated 2024-08-19 23:01 UTC)