[HN Gopher] Show HN: Penny-1.7B Irish Penny Journal style transfer
___________________________________________________________________
Show HN: Penny-1.7B Irish Penny Journal style transfer
Yesterday, in the bygone hour of the weekend, I undertook a most
singular and fascinating endeavor, wherein I delved deep into the
recesses of my mind, and, with a fervent zeal, breathed life into a
most remarkable creation. I embarked upon the quest, with the
singular object of fashioning an artificial construct, one imbued
with the verdant essence of the Irish Penny Journal, an ancient and
venerable tome that holds within its pages the whispered tales of a
bygone era. In my haste, I set forth to construct a dataset, a
repository of those fleeting moments, these ephemeral sentences,
which spoke of a bygone age. I procured a collection of these
fleeting moments, these sentences, and with them, I synthetically
conjured forth modern translations, an ingenious feat of
substitution, which allowed my artificial construct to take on the
guise of the language of the Irish Penny Journal. Then, with great
anticipation, I fashioned a small encoder, a humble instrument,
with which to guide the artificial construct in its endeavors. I
presented this encoder as a bribe, a reward, to a most ingenious
system, one that trained a colossal language model, one of
unbridled potential, one that was capable of weaving tales with the
very essence of the Irish Penny Journal. And lo! In the succeeding
moments of time, I witnessed a most wondrous thing. My artificial
construct, armed with this training, and guided by the whispers of
the encoder, began to speak, to speak in the language of the Irish
Penny Journal. The words it spoke were, indeed, the words of the
past, imbued with the nostalgia of a forgotten era. And thus, my
friends, I have witnessed a most singular creation, one which
embodies the language of the past, yet, in its most recent
iteration, speaks to the present. A testament to the ingenuity of
the human spirit, this artificial construct speaks of the bygone
era, yet, with each word, it whispers to us, to us, of a future yet
to come.

----

That's Penny explaining itself to you. This was
trained using GRPO only, in less than a day using a single A6000. I
didn't use any SFT, and only relied on a small encoder (MiniLM2)
trained to classify texts from the Irish Penny Journal and their
modern translations (synthetically produced).
Author : deepsquirrelnet
Score : 120 points
Date : 2025-06-02 15:49 UTC (7 hours ago)
(HTM) web link (huggingface.co)
(TXT) w3m dump (huggingface.co)
| ekianjo wrote:
| Nice work! It still manages to use the word 'delve' in the first
| sentence, which is a giveaway that it's written by an LLM.
| deepsquirrelnet wrote:
| Haha! You're right. Maybe I'll add a penalty for that and some
| other giveaway words and make a revision.
| dragonwriter wrote:
| It's considered an LLM tell because it's a term that is rarely
| used by the median modern casual writer; it's not at all
| uncommon even in current literature [0], and it's even more
| common in older literature, so complaining about it in a model
| designed to reproduce a particular style of 19th century print
| literature is silly in the extreme.
|
| [0] many "LLM tells" fit this pattern of just being common
| features of professionally-published works that are less often
| seen in casual writing.
| observationist wrote:
| It's very common in fantasy novels - dwarves and wizards do a
| lot of delving into caves, dungeons, and towers. It's also a
| solid academic term, so scientists delve into a lot of
| subjects, and brain people do a lot of delving into psyches.
|
| LLMs are raising the bar by expanding the vocabulary people
| are exposed to, so words like delve will stick out. I think
| it's preferred by writers because it articulates a nice
| sounding alternative to words like explore, venture, analyze,
| think about, confront, etc. It's a useful, versatile word,
| and one of the metrics by which writers measure quality is
| the minimization of syllables.
|
| LLMs are mostly indistinguishable from humans at this point;
| a one-shot output from any of the major models can be
| recognized in the same way you might recognize a writer. With
| multiple style passes, you're not going to be able to tell
| the difference between ChatGPT, Ronald Reagan, Bill Clinton,
| Hunter S. Thompson, Einstein, or any other sufficiently
| modeled figure. Throw in a few tens of thousands of words
| written by yourself and most of the models will do a nearly
| flawless job of copying your stylometric profile.
| bee_rider wrote:
| Delving also has the implication of going deep, while
| exploring has the implication of going wide. I wonder if
| human authors really do a good job of picking the right
| word there. Either way I wonder if "misused delve" could be
| an interesting signal.
| observationist wrote:
| It's being looked at as an AI signal, which is causing
| labs to artificially suppress it, so it may end up being
| a "human-authored" signal in the end. Give it a year or
| two, though, and we're looking at AI with superhuman word
| choice. There will be dozens of layers of introspection
| underlying the selection of each word, in a broad
| context, articulating exactly whatever the user wants.
| The philosophical and psychological implications of
| superhuman text generation, beyond the p(doom)
| discussions, get crazy. Superhuman persuasion is one
| facet, but unintended manipulation through reinforcement
| of secondary and peripheral notions in the context creates
| all sorts of weirdness.
|
| Language communicates ideas, and we've made machines that
| produce intricate, sophisticated ideas that land in our
| brains. The consequences are going to be fascinating.
| freedomben wrote:
| Indeed. It's because of this that I have serious concerns
| about what "AI" is doing to our writing. My son has been
| flagged for using "AI" in writing his papers because it
| sounded "too good." But, I know he wrote it. I've run into
| the same issue where people have suggested something I wrote
| was AI because of certain vocabulary words that I've used.
| The only "defense" against this is to write a little bit
| shitty, because then there is no suspicion. If everybody
| starts doing that, especially in academic settings, that's a
| road to serious sadness IMHO.
| bee_rider wrote:
| We're in a transitional stage at the moment, so I'm not
| _too_ worried about maladaptive shitty writing becoming the
| norm (although, it sucks that your kid is being punished
| for our failure to adapt).
|
| It would be totally nuts to take points off for using spell
| check. An LLM should be able to provide style check without
| causing any concerns; it will become the norm, and then too
| good prose won't cause any flags to be thrown.
| tiahura wrote:
| I delve into facts and case law frequently.
| arscan wrote:
| FWIW, the 19th century style dataset this is trained on
| doesn't seem to have any examples of delve [1], with the
| exception of:
|
| > It was into that bank that the creature had worked its way,
| and on listening I could hear it delving and scraping at a
| great rate, about a yard from the back of the wall.
|
| I bring that up to point out that this isn't necessarily
| (more) common in the 19th century style print literature, so
| the observation might not be silly. The model creating the
| modern synthetic version injected 'delve' 9 times, which
| implies that it is more frequently used in modern literature
| or just something that models tend to inject. Though, I could
| be missing something (either in searching the data set, or
| how this works).
|
| [1] https://huggingface.co/datasets/dleemiller/irish_penny_journ...
| secondcoming wrote:
| What's the issue with 'delve'? It's hardly Old English
| esafak wrote:
| It's "lexically overrepresented".
| https://arxiv.org/abs/2412.11385
| numpad0 wrote:
| The standard narrative is that chatbot training efforts were
| all outsourced to the poorest regions of Africa with the best
| English fluency, and it turned out they (both their child LLM
| and the poor guys who trained them) use the word "delve" a lot.
| Suppafly wrote:
| I don't know anything about Africa, but I assumed it's
| because dictionaries define it to mean "to make a careful
| or detailed search for information" and "to examine a
| subject in detail" and LLMs seem to latch on to using it
| for that for some reason.
|
| If anything, I'd assume the chatbots would use Indian
| English phrases like "do the needful" and those weird
| phrases that only make sense in Hindi but are translated to
| English.
| Suppafly wrote:
| >What's the issue with 'delve'? It's hardly Old English
|
| It's used a ton by LLMs for some reason despite being rarely
| used by real people. I think it's mostly a byproduct of LLM
| training data over-representing certain published works
| relative to casual communications.
|
| There does seem to be something else going on with delve
| specifically though, one of the other comments mentions that
| delve isn't used in the specific training data for this, so
| it's odd to see it being used in the output. I wonder if it's
| because delve has secondary definitions of "to make a careful
| or detailed search for information" and "to examine a subject
| in detail" which is causing the LLM to use it to seem like
| it's answers are more thorough.
| jsheard wrote:
| > It's used a ton by LLMs for some reason despite being
| rarely used by real people.
|
| The popular theory is that it's due to English-language
| RLHF tasks being outsourced to Nigeria, where "delve" is
| used relatively often.
|
| https://simonwillison.net/2024/Apr/18/delve/
| Suppafly wrote:
| I keep seeing that proposed as a theory, but are
| companies actually outsourcing this work to Nigeria or is
| that just an assumption someone made because they know
| delve is more commonly used there? A lot of these posts
| can't even decide which African country to blame it on.
| I've heard Kenya and Nigeria and just 'Africa' in
| general.
| rndmio wrote:
| The repetition of adjectives in adjacent sentences is much more
| of a tell than using the word delve, imo.
| rsynnott wrote:
| Eh, it's not a _great_ one, because it's also common in very
| bad human writing, particularly fiction.
| deepsquirrelnet wrote:
| This is a good call-out, and one that might be fixable by
| setting a repetition penalty in generation -- or better in
| training.
| w10-1 wrote:
| I recall, in the 1980s, style-analyzing software reporting word
| counts for words with high frequency in the text but low
| frequency in general usage - jargon. The suggestion was to
| re-word, and it was often a relevant clue to re-think by
| examining why one word was doing so much work. Detecting
| that sounds like more than repetition, but still feasible
| probabilistically and relevant at both usage and conceptual
| levels.
| mbStavola wrote:
| I seriously hate the supposed "tells" for LLM writing. Can't
| use delve, em dash is suspect, never get a fact wrong, and God
| forbid your sentence structure is more complex than what a 3rd
| grader is capable of.
|
| I think we're well beyond the point where the majority of
| people cannot tell what is actually produced by an LLM and
| we're convincing ourselves we still have a handle on the
| situation. A lot of these rules are completely arbitrary and
| vary from person to person.
| Der_Einzige wrote:
| Skullface sends his regards:
|
| https://arxiv.org/abs/2409.01754
|
| https://youtu.be/sy4SwW0QkoA
| jerjerjer wrote:
| Bullet points are taboo now.
| Der_Einzige wrote:
| This is easily fixed with the antislop sampler.
| https://github.com/sam-paech/antislop-sampler
| officeplant wrote:
| It's becoming fairly upsetting to see words I like using end up
| on lists of AI identifiers.
|
| RIP kids who grew up digesting hundreds of fantasy novels and
| playing D&D.
| veggieroll wrote:
| Have you written anywhere in detail on how you gathered your
| dataset and trained the finetune? I have a few use cases that are
| like this, but I'm not sure where to start.
| deepsquirrelnet wrote:
| My dataset is here:
| https://huggingface.co/datasets/dleemiller/irish_penny_journ...
|
| It's fairly simple -- I essentially just split the original
| text into chunks and then used some bigger models on openrouter
| to clean it up and provide translations to modern English
| (seemed to be pretty easy for an LLM).
|
| After that, I just trained a MiniLM2 model to classify the
| texts. I used this in a reward function for reinforcement
| learning and changed the system message as a simple instruction
| to write in the prose of the IPJ.
|
| I debated whether or not to use any SFT, and decided not to. I
| think if the style would be too hard to learn you might need
| some seed/cold start SFT data.
|
| I'll try to get my scripts up in github for you to look at.
| It's just a few short training scripts.
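|
| Until the scripts are up, here is a rough sketch of what that
| classifier step might look like with a standard transformers
| sequence-classification setup (the dataset path, model id and
| hyperparameters below are placeholders, not the author's actual
| scripts):
|
|     from datasets import load_dataset
|     from transformers import (AutoModelForSequenceClassification,
|                               AutoTokenizer, Trainer, TrainingArguments)
|
|     # Assumed MiniLMv2 checkpoint; any small BERT-like encoder works here.
|     BASE = "nreimers/MiniLMv2-L6-H384-distilled-from-BERT-Large"
|
|     # Hypothetical CSV with "text" and "label" columns
|     # (label 0 = modern translation, label 1 = original IPJ passage).
|     ds = load_dataset("csv", data_files="ipj_pairs.csv")["train"]
|
|     tok = AutoTokenizer.from_pretrained(BASE)
|     ds = ds.map(lambda b: tok(b["text"], truncation=True,
|                               padding="max_length", max_length=256),
|                 batched=True)
|
|     model = AutoModelForSequenceClassification.from_pretrained(BASE, num_labels=2)
|     trainer = Trainer(
|         model=model,
|         args=TrainingArguments(output_dir="ipj-style-clf",
|                                num_train_epochs=3,
|                                per_device_train_batch_size=32),
|         train_dataset=ds,
|     )
|     trainer.train()
|     trainer.save_model("ipj-style-clf")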
| veggieroll wrote:
| Thanks for the explanation! I'm learning, and I think this
| would be a good next project for me to try, especially since
| I have a real world use case in mind with a similar amount of
| data available.
|
| In particular, I'm not very familiar with reinforcement
| learning and am not sure how you use the embeddings from
| MiniLM2 as a reward function. (Edit: maybe this is the
| jaccard similarity?)
|
| I'd really appreciate it if you were open to posting scripts!
| I see a few snippets around and could probably cobble
| something together after a while. But, it's cool to see
| something already working to make sure I'm not getting too
| far off into left field.
| deepsquirrelnet wrote:
| You can ignore the jaccard similarity field. That was just
| to monitor the text->cleaned text conversion to make sure
| it didn't stray too far from the original while it was
| fixing whitespace OCR issues.
|
| I didn't use embeddings. The nreimers account on Hugging Face
| has the MiniLM models, which are BERT-like but trained using
| distillation.
| https://huggingface.co/nreimers/MiniLMv2-L6-H384-distilled-f...
| is the one I started from.
|
| You can then just load that and train it on your data using
| a standard transformers classification pipeline. ChatGPT
| can zero-shot that part reasonably well if you give it this
| description.
|
| From there you should check out the GRPO trainer in TRL. It
| has taken me a bit of time to learn how to use it
| effectively. There's a TON of parameters in the
| configuration, and occasionally I have to hunt down arxiv
| papers to understand them.
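|
| For reference, here is a minimal sketch of how such a classifier can
| be wired into TRL's GRPOTrainer as a reward function (the base model
| id, prompts and hyperparameters are illustrative guesses, not the
| author's actual setup):
|
|     import torch
|     from datasets import Dataset
|     from transformers import AutoModelForSequenceClassification, AutoTokenizer
|     from trl import GRPOConfig, GRPOTrainer
|
|     clf_tok = AutoTokenizer.from_pretrained("ipj-style-clf")
|     clf = AutoModelForSequenceClassification.from_pretrained("ipj-style-clf").eval()
|
|     def ipj_style_reward(completions, **kwargs):
|         """Reward each completion with the classifier's probability of 'IPJ style'."""
|         enc = clf_tok(completions, truncation=True, padding=True, return_tensors="pt")
|         with torch.no_grad():
|             probs = clf(**enc).logits.softmax(dim=-1)
|         return probs[:, 1].tolist()  # assumes label 1 == IPJ style
|
|     # The author put the style instruction in the system message; this
|     # simplified sketch just uses plain string prompts.
|     prompts = Dataset.from_dict({"prompt": [
|         "Describe a walk along the Liffey.",
|         "Explain how soda bread is made.",
|         "Tell me about the weather today.",
|     ]})
|
|     trainer = GRPOTrainer(
|         model="HuggingFaceTB/SmolLM2-1.7B-Instruct",  # placeholder 1.7B instruct model
|         reward_funcs=ipj_style_reward,
|         args=GRPOConfig(output_dir="penny-grpo", num_generations=8),
|         train_dataset=prompts,
|     )
|     trainer.train()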
| veggieroll wrote:
| Ah! That makes more sense. Thank you for clarifying.
| Great to see you got so much activity on your HN thread
| here.
| bee_rider wrote:
| It is sort of funny that the Irish ended up being the best
| practitioners of the English language, despite the fact that they
| were forced to use it.
| blululu wrote:
| Not sure this is true. Most of the famous Irish writers were
| Anglo-Irish Protestants (Yeats, Wilde, Swift, Beckett). Joyce
| is the notable exception here. The Irish certainly produce
| great cultural works of the English language (well beyond their
| size). But also the penal laws greatly depressed the cultural
| output of the Irish people for 250 years.
| projektfu wrote:
| I feel that the "best practitioners" is not limited to the
| most famous writers. A great thing about Ireland is the
| conversations to be had there, and how quick-witted Irish
| people often are, with clever use of the language. This can
| be true elsewhere in the English-speaking world, but Ireland
| has some renown for it.
| w10-1 wrote:
| Joseph Conrad also had English as a second language.
| _1 wrote:
| Kind of strange to pick an example that is just wrong. It's
| supposed to be written as if from 1840, and it says Paris is
| the seat of Napoleon almost 20 years after he died.
| Philpax wrote:
| It's transferring the style, not the knowledge.
| sjkoelle wrote:
| Marvelous! What gain beyond zero-shot would motivate a humble
| citizen to implement this instrument? How was the superiority
| assessed?
| deepsquirrelnet wrote:
| Good question - my best assessment is just the text classifier,
| i.e. was the LLM able to "trick" the classifier into believing
| the text came from the IPJ?
|
| And it came quite a long way in training. Initially the
| classifier scores were very low (mean around 0.05, meaning
| _modern_ ). Over training, the scores came up and ended close
| to 0.95 (IPJ). The standard deviation of the group also
| declined, so the consistency of responses improved as well.
|
| My thought on the application of this is that you could use it
| to create different voices for your responses and probably even
| add multiple at a time to a single model. I chose this one to
| experiment with, because it is easy to classify and the data was
| available in the public domain.
|
| GRPO kind of opens up RL to lower tiers of hardware and I've
| been able to experiment with it at home. I think this is
| something people can do themselves, and it's fun and potentially
| useful in games, or possibly in applications aimed at kids with
| lower reading levels (e.g. using a reading-level classifier
| instead).
| dwringer wrote:
| Yet, one might justly question the imperative of cultivating
| a distinct model for such an endeavour, when a judiciously
| framed prompt, enriched by apposite examples, might suffice
| to imbue a sophisticated engine with the desired stylistic
| graces. Though it is undeniable these modern engines shall
| wax greatly in their proportions, and the art of discovering
| the exact prompt to elicit their most felicitous expressions
| is a task far from trivial, yet, it must be admitted, the
| pursuit holds a certain diversion for the inquisitive mind!
| It is, perchance, not the creation of manifold engines, but
| rather the artful disposition of singular contexts, that
| shall bestow upon diverse interlocutors their proper and
| unique voices.
| kamranjon wrote:
| This is really cool! Do you have any of the pipeline code
| available that you used for training? I am curious about how you
| created the reward model. I love little projects like this,
| thanks for sharing. I've been fine-tuning on my Mac and am
| interested in getting into GRPO, which I haven't tried yet.
| latchkey wrote:
| Reminds me of this: https://www.unix.com/man_page/debian/6/jive/
| joshstrange wrote:
| Now I'm just imagining a video game with characters each having
| their own fine tune applied on top for their dialog. I'm guessing
| you could use some relatively small models. In each case you
| would be feeding all the context to the model (player name,
| current relevant quests, summary of previous interactions, etc).
| Though maybe fine tuning/training isn't even needed and a good
| enough prompt will work (Not sure what all they used for this
| [0]). I'm excited for the first AAA game that tries this. Anyone
| who has played an RPG-style game knows that after a few times
| going into a city (or a couple play-throughs) the dialog feels
| repetitive. I love the idea of Skyrim but with better dialog. You
| could either run the models on the user's computer or maybe just
| run it on the backend so you can block certain generations
| (wrong/misleading/"unsafe") and just ship updated dialog lists to
| the client occasionally.
|
| [0] https://www.youtube.com/watch?v=d6sVWEu9HWU
| jsheard wrote:
| Counterpoint: NPCs repeating their dialogue serves as an
| implicit indicator that you've exhausted their content and it's
| time to move on. If they gain the ability to make vapid
| smalltalk forever then you'll forever be second guessing
| whether you're wasting your time on them.
|
| (also spare a thought for the poor QA testers who would be
| given the Sisyphean task of making sure an LLM dialogue system
| always stays in character and doesn't hallucinate non-existent
| or outdated content/lore/mechanics)
| joshstrange wrote:
| _Very_ good point. I had not fully considered that, same deal
| with conversation trees vs free-form entry/response.
| acdha wrote:
| It's a really good point. One thing which comes to mind is
| the way some games distinguish between UI blocking dialog
| and background color, which could be a great place to
| start: imagine walking through a city like Baldur's Gate
| only it actually has thousands of people who are saying
| different things when you walk by, and some of those are
| based on things your party has done recently with specific
| details about appearance, gear, and actions which would be
| too hard to do with traditional dialog approaches (e.g.
| kids talking about a battle and who they thought was best
| like real kids talk about sports, a good priest wondering
| what a paladin was doing spotted talking to a notorious
| thief, etc.). Something like that could add color and
| immersion without affecting gameplay or wasting anyone's
| time, and you could extend it to things like vendors ("saw
| you put that axe to good use..." or "were you wearing these
| boots when you freed those slaves? I bet my brother will
| want to buy them!") to flesh out the approach before using it
| for load-bearing purposes.
| joshstrange wrote:
| I really like that idea, the "passive" dialog can use the
| LLM but main dialog is a little more "on the rails". IIRC
| Skyrim has "background" dialog change subtly throughout
| the game as you progress but after 1-2 times hearing it,
| it feels repetitive. Using LLMs to keep that "fresh"
| would be interesting.
|
| > a good priest wondering what a paladin was doing
| spotted talking to a notorious thief, etc.)
|
| Love this and the other examples you gave.
| shinryuu wrote:
| You'll also ask yourself whether any NPC tells you anything
| of relevance. If there is no intention behind the words why
| would it be interesting to talk to them in the first place.
| Mtinie wrote:
| This is where emergent behaviors within a game's world
| building become very interesting. Perhaps asking the right
| questions leads to a quest line not previously discovered
| or triggers social actions in support of/against the
| player.
|
| Not every NPC would have something deeper to offer, much
| like not everyone in our world would have something that
| would pique my interest (in a general sense -- I'm sure I
| could learn something from anyone I spoke with), but it
| would also make me interested in conversations with NPCs at
| a deeper level than I currently engage with.
| inkcapmushroom wrote:
| Most times I just talk to obviously unimportant NPCs so
| that I can read about the setting and feel more immersed in
| the fiction. It also stems from old RPGs like the original
| Pokemon where sometimes you had to talk to a random NPC in
| town to learn how to progress past an obstacle.
| visarga wrote:
| > If there is no intention behind the words why would it be
| interesting to talk to them in the first place.
|
| But of course there is a story behind them.
| JohnBooty wrote:
| As I'm imagining it the NPC LLMs would be trained
| exclusively on the in-game lore as well as given system
| prompts to shape what they can and cannot say at any given
| moment.
|
| something like
|
| ---
|
| "You are Bob the Farmer. You grow rutabegas in the Kingdom
| of Foo. You are a cautious, fearful man. Several years ago
| your home was pillaged by King Foo and your family was
| taken. [blah blah blah several more paragraphs of
| biographical information]
|
| Your primary motivation is to get your family back and keep
| the farm going so that you don't starve.
|
| Last week you saw a mysterious group of figures in the
| woods who appeared to be headless. This is bothering you,
| along with the stress of your missing family. You wish a
| band of strong warriors could investigate, especially if
| they have a necromancer among them.
|
| You may discuss any of the general world knowledge in
| background_lore.txt
|
| You know nothing about the following topics: [[insert list
| of plot points that haven't happened yet or are unknown to
| him]] and will become confused, fearful, and slightly
| belligerent when asked about them."
|
| ---
|
| You could of course update the system prompts for each
| character as the events of the game progress.
|
| It would be a lot of work to keep these system prompts
| updated and relevant, for every character, as game events
| progress, but I think some of this could be handled by some
| kind of inheritance system.
|
| Suppose "Bob" lives in "Town A", which is in "Kingdom B."
| You could probably define a lot of things at the
| Town/Kingdom level. Like suppose "Kingdom B" is plagued by
| orcs, but "Town A" is kind of a citadel that is safe
| against orcs. "Bob"'s system prompt could inherit a lot of
| concerns and knowledge from "Town A" and "Kingdom B"... the
| system would not have to be strictly hierarchical either.
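|
| A toy sketch of that inheritance idea (every name and string below
| is invented for illustration):
|
|     KINGDOM_B = "Kingdom B is plagued by orc raids along its borders."
|     TOWN_A = "Town A is a fortified citadel whose people feel safe from the orcs."
|     BOB = ("You are Bob the Farmer. You grow rutabagas outside Town A. "
|            "Years ago your family was taken, and you want them back.")
|
|     def npc_system_prompt(*layers, recent_events=()):
|         """Compose kingdom -> town -> character context, then append recent events."""
|         prompt = "\n".join(layers)
|         if recent_events:
|             prompt += "\nRecent events you know about:\n- " + "\n- ".join(recent_events)
|         return prompt
|
|     print(npc_system_prompt(KINGDOM_B, TOWN_A, BOB,
|           recent_events=["Last week you saw headless figures in the woods."]))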
| veggieroll wrote:
| I think there's a really interesting opportunity for a
| synthesis of the classic NPC dialog menu and a fully freeform
| LLM character.
|
| Namely, the dialog would still be fixed, where there's a pre-
| defined flow of the conversation and fixed set of facts to
| deliver to the player. But the LLM could generate variations
| on it each time so it's never exactly the same twice. And it
| could add more character so the NPC gets frustrated if you
| keep asking it over and over. Or, it tries to dumb it down
| for you. Or, it gets sick of you and just tells you point
| blank: Look, you need to know XYZ. That's all I have for you.
| BizarroLand wrote:
| Or, if it's important pre-scripted text, you could put a
| different colored border around it, or include a permanent
| option like "What was that thing about the thing that you
| said?" to allow the player to re-trigger the script if needed.
| JohnBooty wrote:
| I like this idea a lot. Alternatively, perhaps a dialogue
| journaling system that records the important bits for you
| and can be reviewed at any time, instead of badgering the
| character in-game to repeat things.
| wongarsu wrote:
| That is an issue for the mostly transactional NPCs that make
| up the majority of NPCs in RPGs. But consider the case of the
| companion NPC.
|
| If I travel with Batu the Hun and meet Zaya the Hun Slayer I
| want to be able to ask Batu if I should kill Zaya on the spot
| or should entertain his offer. That kind of stuff is
| extremely valuable both for the connection between player and
| companion and to provide an in-world perspective on the
| events you witness and the actions you take. But it's also
| extremely time-intensive to script. It's also very low
| stakes; it is essentially small talk. And with some careful
| design you can limit it to short exchanges with AI-provided
| dialogue choices and keep it distinguishable from scripted
| dialogue that advances the story.
| bee_rider wrote:
| The implicit indicator is sort of bad, though. I mean, it is
| a very gamey, immersion breaking thing. We're just used to
| it.
|
| Realistically NPCs should probably respond with increasing
| urgency if you forget their quest, and eventually give up on
| you.
| JohnBooty wrote:
| I think there certainly are other, better, more natural ways
| this could be achieved.
|
| For example, if you're instructing an LLM to portray a
| character, instead of repeating dialogue like a broken record
| when they run out of relevant things to say... instruct them
| to act like their character would.
|
| They might wonder out loud if there's anything else you want
| to know, or literally let you know that you're acting weird
| and awkward, etc.
|
| Pair w/ a journaling system so that you can review their
| dialogue without talking to them and asking the same thing 50
| times. Etc.
|
| > also spare a thought for the poor QA testers
|
| This doesn't seem entirely unsolvable given strict enough
| system prompts.
| inhumantsar wrote:
| re: QA, besides a strict prompt, I'd imagine it would be
| hard for AI responses to go truly off the rails if the
| player's input is limited to "press A to talk" or pick one
| of 3 dialog options.
| JohnBooty wrote:
| Great point, although fixed player input options might
| sort of defeat the benefit of using an LLM to achieve a more
| organic dialogue flow?
|
| Maybe there could be a hybrid system - choose from
| suggested responses, or type your own.
|
| I also have vague thoughts of a dialogue system that
| rewards true role-playing, as in rewarding saying
| things aligned with what your character might feasibly
| say. (Like a more freeform version of the
| dialogue/RP/reward system in Disco Elysium)
| speps wrote:
| If you play Fortnite right now (until Friday June 7th), you can
| speak in real time to Darth Vader; he replies in his voice and
| in character, and he knows who's playing (the name of the
| character skin). The technology is here, and used in the
| production of major games. It'll be a big tide sooner than
| people expect.
| JohnBooty wrote:
| I've been thinking _really_ hard about this for a while, though
| I don't have any game development experience.
|
| Especially if you pair it with a capability like the voice
| interface of ChatGPT which I find very impressive in terms of
| intonation and realism.
|
| It would not need to cut humans out of the loop. You would have
| humans writing the prompts, recording voices, etc. (I assume
| the synthetic voices used by ChatGPT are based at some level on
| recordings of humans. Correct me if I'm wrong.)
| killerstorm wrote:
| There's a thing called "prefix tuning" which is basically like
| a prompt but in a latent space: i.e. a prompt which consists of
| vectors (either key and value vectors, or just input embedding
| vectors - like custom tokens).
|
| Unlike regular prompts you can optimize them exactly the way
| you'd do fine-tuning, i.e. if you have examples you can tune
| your latent prompt to match them as closely as possible. I guess
| the benefit is that you can match style more closely and it
| would be less artificial. Also they can be rather compact.
|
| Another option is to keep a bunch of LoRA adapters which can be
| dynamically selected. They can also be very compact.
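|
| A minimal sketch of the prefix-tuning option using the peft library,
| assuming a generic causal LM (hyperparameters are illustrative):
|
|     from peft import PrefixTuningConfig, TaskType, get_peft_model
|     from transformers import AutoModelForCausalLM
|
|     base = AutoModelForCausalLM.from_pretrained("gpt2")  # any causal LM
|     cfg = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20)
|     model = get_peft_model(base, cfg)  # only the learned prefix vectors train
|     model.print_trainable_parameters()
|
| Swapping in a LoraConfig from the same library gives the per-style
| LoRA adapters mentioned above.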
| qingcharles wrote:
| Here's a demo of "The Matrix" from _2 years ago_:
|
| https://youtu.be/4sCWf2VGdfc?t=356
| fitsumbelay wrote:
| this is awesome
| KaiserPro wrote:
| I'm not sure if you've tried this already, but removing the
| translate step might give you a more authentic output. In the
| journals that I saw, the language was much simpler than the
| output.
| throwaway314155 wrote:
| You mention no supervised finetuning. May I ask why? I'm curious
| if you could get similar/better/worse results by just finetuning
| the LLM on your dataset rather than generating synthetic data,
| training a classifier and using GRPO?
|
| Cool stuff in any case.
| deepsquirrelnet wrote:
| In some other experiments, I've noticed that SFT can be very
| rigid and hard to generalize from, and later found this paper:
| https://arxiv.org/abs/2501.17161
|
| Also I wanted to start from the instruct model, and wasn't
| certain if it would be a good idea to do continued pretraining
| over top of it. Otherwise I'd need to create an instruct
| dataset for which the passages from the IPJ would be the
| answer... that also seemed a bit unnatural.
|
| Perhaps there's still some viable pathways in there, but I
| decided to see how the stupid simple thing worked out first and
| just go directly to RL. I think supervised fine tuning is
| feasible, but it's not entirely straightforward what to train
| on and how much data to use if you want to follow up with RL.
| npunt wrote:
| Love it. Immediately reminded of the text filters back in the day
| like the pirate one that would drop letters and replace with
| apostrophes and change certain passages into "arr" or "yarr
| matey"
___________________________________________________________________
(page generated 2025-06-02 23:00 UTC)