[HN Gopher] Domain Adaptation of Base Models + ShadowdarkQA Bench
       ___________________________________________________________________
        
       Domain Adaptation of Base Models + ShadowdarkQA Bench
        
       Author : pact_inference
       Score  : 16 points
       Date   : 2025-05-29 13:59 UTC (9 hours ago)
        
 (HTM) web link (gygaxtest.com)
 (TXT) w3m dump (gygaxtest.com)
        
       | palmfacehn wrote:
       | Isn't this a use case for a RAG?
        
         | pact_inference wrote:
          | Definitely! However, my intuition is that correctly
          | interpreting the rules pulled into context will require some
          | basic understanding of the game system that pretraining
          | would help with. Ultimately, after instruction-tuning this
          | base model and training it for tool use (to provide a
          | search tool), I'll compare it against
          | https://huggingface.co/Qwen/Qwen3-0.6B without any
          | domain-specific pretraining and see how each performs at
          | rule adjudication. I expect the Shadowdark-trained model
          | will have a better understanding of the rules, but there's
          | only one way to find out.
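          | 
          | A minimal sketch of that head-to-head (multiple-choice
          | scoring by answer log-likelihood; the second model id is a
          | placeholder for my adapted checkpoint, and the QA item is
          | invented):
          | 
          |     import torch
          |     from transformers import (AutoModelForCausalLM,
          |                               AutoTokenizer)
          | 
          |     def answer_logprob(model, tok, question, answer):
          |         # Sum of the log-probs of the answer tokens only.
          |         q_len = tok(question,
          |                     return_tensors="pt").input_ids.shape[1]
          |         ids = tok(question + " " + answer,
          |                   return_tensors="pt").input_ids
          |         with torch.no_grad():
          |             logits = model(ids).logits[:, :-1]
          |         logp = torch.log_softmax(logits, dim=-1)
          |         tgt = ids[:, q_len:]
          |         rows = logp[0, q_len - 1 : ids.shape[1] - 1]
          |         return rows.gather(1, tgt.T).sum().item()
          | 
          |     q = "Q: Which die do you roll to attack? A:"
          |     for name in ["Qwen/Qwen3-0.6B",
          |                  "shadowdark-qwen"]:  # 2nd is hypothetical
          |         tok = AutoTokenizer.from_pretrained(name)
          |         model = (AutoModelForCausalLM
          |                  .from_pretrained(name).eval())
          |         best = max(["d20", "d6", "d100"],
          |                    key=lambda a: answer_logprob(
          |                        model, tok, q, a))
          |         print(name, "->", best)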
        
           | palmfacehn wrote:
           | It is an interesting problem to solve. When reading, I
           | noticed the model's ambiguity around terms like 4d6. At first
           | I thought you might try editing your markup to describe the
           | concept of dice more thoroughly. Ultimately I wonder if you
           | might try having the model fill in data to be utilized by a
            | hard-coded combat system. Are you going to rely on the LLM
           | for pseudorandom numbers? Concepts like turns and dice rolls
           | could be abstractly defined in code and instantiated by the
           | model.
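            | 
            | A tiny sketch of that separation (the "NdM" notation
            | parser below is illustrative): the model only emits a
            | dice expression, and hard-coded logic supplies the
            | randomness.
            | 
            |     import random
            |     import re
            | 
            |     def roll(expr: str) -> int:
            |         # Roll an "NdM" expression, e.g. "4d6" means
            |         # the sum of four six-sided dice.
            |         m = re.fullmatch(r"(\d+)d(\d+)", expr.strip())
            |         if not m:
            |             raise ValueError(f"bad dice expr: {expr!r}")
            |         n, sides = int(m.group(1)), int(m.group(2))
            |         return sum(random.randint(1, sides)
            |                    for _ in range(n))
            | 
            |     print(roll("4d6"))  # code rolls; model just asks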
           | 
           | The model might excel at creating character sheets, after you
           | define a schema. From there you can validate the generated
            | sheets against known lore. You could combine the
            | storytelling from the LLM with the formalized character
            | schema to
           | create campaigns. I'm not an expert here, but I suspect you
           | might try asking the model to translate an existing fantasy
           | story dataset into a series of narration/dialogue blocks and
           | character sheets.
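            | 
            | As a sketch, that schema could start as small as a
            | dataclass plus a validation pass (the field names and
            | bounds here are invented, not taken from the rules):
            | 
            |     from dataclasses import dataclass
            | 
            |     @dataclass
            |     class CharacterSheet:
            |         name: str
            |         char_class: str
            |         level: int
            |         stats: dict  # e.g. {"STR": 13, "DEX": 9}
            | 
            |         def validate(self) -> None:
            |             if not 1 <= self.level <= 10:
            |                 raise ValueError("level out of range")
            |             if any(not 3 <= v <= 18
            |                    for v in self.stats.values()):
            |                 raise ValueError("stat outside 3-18")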
           | 
           | Without training, I've experimented with similar approaches
           | for item generation using EBNF.
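            | 
            | For instance (a made-up toy grammar, written for the
            | `lark` parsing library), generated item text can simply
            | be rejected whenever it fails to parse:
            | 
            |     from lark import Lark
            | 
            |     grammar = r'''
            |         item: quality? NOUN ("of" power)?
            |         quality: "rusty" | "gleaming" | "cursed"
            |         power: "fire" | "warding"
            |         NOUN: "sword" | "amulet" | "lantern"
            |         %import common.WS
            |         %ignore WS
            |     '''
            |     parser = Lark(grammar, start="item")
            |     parser.parse("gleaming sword of warding")  # accepted
            |     # parser.parse("plasma rifle") -> parse error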
        
             | pact_inference wrote:
             | > Are you going to rely on the LLM for pseudorandom
             | numbers?
             | 
              | Definitely! I'm going to start with instruction-tuning it
             | for basic question answering, and then add tools to allow
             | it to search the markdown source to cite answers to rules
             | questions. I think adding some dice tooling for proper
             | character sheet creation would be an awesome task to test
             | as well. I'm actually thinking a lot about what tasks I
             | could try that are "trivially" programmatically verifiable
             | in their correctness for stuff like GRPO, so I'm definitely
             | going to use that idea.
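              | 
              | As a sketch of what "trivially" verifiable means here
              | (the task format is invented), the reward can be pure
              | code, with no judge model in the loop:
              | 
              |     import re
              | 
              |     def max_roll_reward(expr: str,
              |                         completion: str) -> float:
              |         # Reward 1.0 iff the completion names the
              |         # correct maximum roll of an "NdM"
              |         # expression, e.g. "4d6" -> 24.
              |         n, sides = map(int, expr.split("d"))
              |         m = re.search(r"\d+", completion)
              |         if m and int(m.group()) == n * sides:
              |             return 1.0
              |         return 0.0
              | 
              |     print(max_roll_reward("4d6", "Max is 24."))  # 1.0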
             | 
             | > You could combine the story telling from the LLM with the
             | formalized character schema to create campaigns. I'm not an
             | expert here, but I suspect you might try asking the model
             | to translate an existing fantasy story dataset into a
             | series of narration/dialogue blocks and character sheets.
             | 
              | I think probably late this year I'll be able to work on
              | that sort of thing. There's a really interesting
              | approach to story generation here
              | (https://arxiv.org/abs/2503.22828), but working out how
              | to translate it into campaign-relevant structured
              | objects, and how to "reward" that, will take some
              | experimentation.
        
       | jasonjmcghee wrote:
       | > I used the AdamW optimizer and selected a learning rate of
       | 5e-5. I've seen learning rates of 5e-6 for pretraining and 5e-5
       | for finetuning. I would consider this closer to the latter - I
       | don't want to totally destroy the knowledge Qwen already had, I
       | just want to add to it a bit.
       | 
       | Is this a typo? Maybe 5e-4 for pretraining?
       | 
        | Otherwise this goes against all the intuition I have around
        | learning rates and catastrophic forgetting (it would imply
        | that a smaller learning rate causes knowledge degradation).
        
         | pact_inference wrote:
          | Whoops, definitely a typo! It should be 5e-4 as the base
          | "pretraining" LR; you're absolutely correct.
          | 
          | Your intuition is sound, but my fingers are not.
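          | 
          | For the record, the corrected setup is just a one-line
          | difference (the Linear layer below stands in for the real
          | checkpoint):
          | 
          |     import torch
          | 
          |     model = torch.nn.Linear(8, 8)  # stand-in model
          |     # base "pretraining" LR, per the correction above
          |     opt = torch.optim.AdamW(model.parameters(), lr=5e-4)
          |     # vs. the gentler finetuning-style LR
          |     # torch.optim.AdamW(model.parameters(), lr=5e-5)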
        
       ___________________________________________________________________
       (page generated 2025-05-29 23:01 UTC)