[HN Gopher] Domain Adaptation of Base Models + ShadowdarkQA Bench
___________________________________________________________________
Domain Adaptation of Base Models + ShadowdarkQA Bench
Author : pact_inference
Score : 16 points
Date : 2025-05-29 13:59 UTC (9 hours ago)
(HTM) web link (gygaxtest.com)
(TXT) w3m dump (gygaxtest.com)
| palmfacehn wrote:
| Isn't this a use case for a RAG?
| pact_inference wrote:
| definitely! However, my intuition is that correctly
| interpreting rules pulled into context will require some basic
| understanding of the game system, which pretraining should
| help with. Ultimately, after instruction-tuning this base
| model and training it for tool use (to provide a search tool),
| I'll compare it against https://huggingface.co/Qwen/Qwen3-0.6B
| without any domain-specific pretraining and see how it
| performs at rule adjudication. I expect the Shadowdark-trained
| model will have a better understanding of the rules, but
| there's only one way to find out.
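|
| The baseline half of that comparison could be as simple as
| this (a sketch using Hugging Face transformers; the prompt is
| made up for illustration):
|
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     name = "Qwen/Qwen3-0.6B"
|     tok = AutoTokenizer.from_pretrained(name)
|     model = AutoModelForCausalLM.from_pretrained(name)
|
|     # A rule-adjudication question; the domain-pretrained
|     # model would be scored on the same prompts.
|     prompt = "In Shadowdark, what happens when a torch goes out?"
|     inputs = tok(prompt, return_tensors="pt")
|     out = model.generate(**inputs, max_new_tokens=128)
|     print(tok.decode(out[0][inputs["input_ids"].shape[1]:],
|                      skip_special_tokens=True))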
| palmfacehn wrote:
| It is an interesting problem to solve. When reading, I
| noticed the model's ambiguity around terms like 4d6. At first
| I thought you might try editing your markup to describe the
| concept of dice more thoroughly. Ultimately I wonder if you
| might try having the model fill in data to be consumed by a
| hard-coded combat system. Are you going to rely on the LLM
| for pseudorandom numbers? Concepts like turns and dice rolls
| could be abstractly defined in code and instantiated by the
| model.
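|
| That dice abstraction can be tiny; a minimal Python sketch
| (the expression grammar and names here are illustrative
| assumptions):
|
|     import random
|     import re
|
|     DICE_RE = re.compile(r"(\d+)d(\d+)([+-]\d+)?")
|
|     def roll(expr: str) -> int:
|         """Roll a dice expression like '4d6' or '2d8+3'."""
|         m = DICE_RE.fullmatch(expr.strip())
|         if not m:
|             raise ValueError(f"bad dice expression: {expr!r}")
|         count, sides = int(m.group(1)), int(m.group(2))
|         mod = int(m.group(3) or 0)
|         return sum(random.randint(1, sides)
|                    for _ in range(count)) + mod
|
| The model only ever emits the string "4d6"; the host code does
| the actual rolling, so randomness never depends on the LLM.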
|
| The model might excel at creating character sheets, after you
| define a schema. From there you can validate the generated
| sheets against known lore. You could combine the story
| telling from the LLM with the formalized character schema to
| create campaigns. I'm not an expert here, but I suspect you
| might try asking the model to translate an existing fantasy
| story dataset into a series of narration/dialogue blocks and
| character sheets.
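|
| A schema plus a lore validator could start as small as this
| (a Python sketch; the ancestry list and stat range below are
| illustrative assumptions, not pulled from the rulebook):
|
|     from dataclasses import dataclass, field
|
|     KNOWN_ANCESTRIES = {"dwarf", "elf", "goblin",
|                         "half-orc", "halfling", "human"}
|
|     @dataclass
|     class CharacterSheet:
|         name: str
|         ancestry: str
|         char_class: str
|         stats: dict[str, int] = field(default_factory=dict)
|
|     def validate(sheet: CharacterSheet) -> list[str]:
|         """Return a list of lore violations (empty if valid)."""
|         errors = []
|         if sheet.ancestry not in KNOWN_ANCESTRIES:
|             errors.append(f"unknown ancestry: {sheet.ancestry}")
|         for stat, value in sheet.stats.items():
|             if not 3 <= value <= 18:  # 3d6 range
|                 errors.append(f"{stat}={value} out of 3d6 range")
|         return errors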
|
| Without training, I've experimented with similar approaches
| for item generation using EBNF.
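|
| One way to put such a grammar to work is validating generated
| items against it (a sketch using the lark parsing library as a
| stand-in; the toy grammar below is illustrative, not the one
| actually used):
|
|     from lark import Lark  # pip install lark
|
|     ITEM_GRAMMAR = r"""
|         item: adjective noun ("of" power)?
|         adjective: "rusty" | "gleaming" | "ancient"
|         noun: "sword" | "amulet" | "lantern"
|         power: "warding" | "light" | "flame"
|         %import common.WS
|         %ignore WS
|     """
|
|     parser = Lark(ITEM_GRAMMAR, start="item")
|
|     def is_valid_item(text: str) -> bool:
|         try:
|             parser.parse(text)
|             return True
|         except Exception:
|             return False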
| pact_inference wrote:
| > Are you going to rely on the LLM for pseudorandom
| numbers?
|
| Definitely! I'm going to start by instruction-tuning it
| for basic question answering, and then add tools that let
| it search the markdown source to cite answers to rules
| questions. I think adding some dice tooling for proper
| character sheet creation would be an awesome task to test
| as well. I'm also thinking a lot about tasks whose
| correctness is "trivially" programmatically verifiable, for
| stuff like GRPO, so I'm definitely going to use that idea.
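|
| A programmatically verifiable task could look like this toy
| reward function (a Python sketch; the task and names are
| illustrative assumptions, not part of the actual plan):
|
|     import re
|
|     def minmax_reward(expr: str, completion: str) -> float:
|         """Reward 1.0 if the completion contains the true min
|         and max totals of a dice expression like '4d6'."""
|         count, sides = map(int, expr.split("d"))
|         lo, hi = count, count * sides
|         nums = {int(n) for n in re.findall(r"\d+", completion)}
|         return 1.0 if lo in nums and hi in nums else 0.0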
|
| > You could combine the story telling from the LLM with the
| formalized character schema to create campaigns. I'm not an
| expert here, but I suspect you might try asking the model
| to translate an existing fantasy story dataset into a
| series of narration/dialogue blocks and character sheets.
|
| I think probably late this year I'll be able to work on
| that sort of thing. There's a really interesting approach
| to story generation here: https://arxiv.org/abs/2503.22828,
| but figuring out how to translate it into campaign-relevant
| structured objects, and how to "reward" that, will take
| some experimentation.
| jasonjmcghee wrote:
| > I used the AdamW optimizer and selected a learning rate of
| 5e-5. I've seen learning rates of 5e-6 for pretraining and 5e-5
| for finetuning. I would consider this closer to the latter - I
| don't want to totally destroy the knowledge Qwen already had, I
| just want to add to it a bit.
|
| Is this a typo? Maybe 5e-4 for pretraining?
|
| Otherwise this goes against all the intuition I have around
| learning rates and catastrophic forgetting (a smaller learning
| rate causing knowledge degradation).
| pact_inference wrote:
| whoops, definitely a typo! It should be 5e-4 as the base
| "pretraining" LR, you're absolutely correct.
|
| your intuition is sound, but my fingers are not.
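|
| The corrected setting, as a minimal PyTorch sketch (`model`
| is assumed to be the Qwen base model being trained):
|
|     import torch
|
|     # ~5e-4 for (continued) pretraining; drop toward ~5e-5
|     # when finetuning, to better preserve existing knowledge.
|     optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)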
___________________________________________________________________
(page generated 2025-05-29 23:01 UTC)