[HN Gopher] Longwriter - Increase llama3.1 output to 10k words
       ___________________________________________________________________
        
       Longwriter - Increase llama3.1 output to 10k words
        
       Author : taikon
       Score  : 88 points
       Date   : 2024-10-07 14:05 UTC (8 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | alwinaugustin wrote:
        | How do you use this with a local ollama setup?
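        | 
        | One possible route, sketched under the assumption that the
        | LongWriter weights have been converted to GGUF and imported
        | with "ollama create" (the model name below is illustrative,
        | not an official tag), is the ollama Python client:
        | 
        |   import ollama  # pip install ollama; talks to the local server
        | 
        |   # "longwriter-llama3.1" is a placeholder for a locally
        |   # imported build of the model, not an official tag.
        |   response = ollama.chat(
        |       model="longwriter-llama3.1",
        |       messages=[{
        |           "role": "user",
        |           "content": "Write a 10,000-word guide to Iceland.",
        |       }],
        |       options={
        |           "num_ctx": 32768,   # room for the long output
        |           "num_predict": -1,  # no cap on generated tokens
        |       },
        |   )
        |   print(response["message"]["content"])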
        
       | danng87 wrote:
       | Interesting project!
       | 
       | Does anyone know how LongWriter handles maintaining coherence and
       | structure in longer outputs? Also, are there specific strategies
       | or parameters recommended for fine-tuning LLaMA 3.1 with this
       | setup to maximize the quality of generated text?
        
         | yawnxyz wrote:
         | How do people eval these very long outputs?
         | 
         | I've never figured that out (and no I can't just... read all of
         | them)
        
           | Multicomp wrote:
           | I don't know how to answer your question. But. I will say
            | that I could see a future where one has a brainstormed
            | setting / plot outline / concept, has the LLM output a
            | first draft of whatever length, then makes changes / tweaks
            | to the story / copy over time.
           | 
           | The hardest part of writing for me is the first draft.
           | Editing an existing copy to my own human artistic vision is
           | much easier. No, this character doesn't act like this, he
           | acts like that.
           | 
            | Presuming you don't have an allergic reaction to AI-affected
            | writing copy (even though the publishing houses are going to
            | outsource their copyedits and style-guide edits to LLMs, that
            | is not hard to predict), an author could start with the
            | soulless draft and then hand-edit from there until they like
            | it.
           | 
            | That puts the copy into a hybrid world where AI was used as a
            | power tool, not as the entire product. Copyright law may
            | frustrate that for a time (if, say, more than 5% of the final
            | copy being AI-generated makes it ineligible for copyright
            | protection), but otherwise there will be stories, and the
            | best stories will win.
           | 
            | 1. Hand-crafted with a fountain pen through all the edits,
            | digitized to an opendoc (ok, who are we kidding, .docx, but I
            | can dream of open file formats)
           | 
            | 2. This story was started and stayed digital-native in
            | Scrivener / yWriter and was eventually dumped to a .docx
           | 
            | 3. This story started in an LLM chat response and was heavily
            | edited to match the artist's human vision
           | 
            | All 3 stories will exist. And there will be a sea of slop
            | that used (3) and then barely edited a thing, hoping to sell
            | a book by SEO tag manipulation and an 'eye-catching'/lurid
            | cover, just as there is now with (2): hastily thrown-together
            | rip-offs of others' text.
           | 
           | But you can believe that I will be glad to go all Star Trek
           | Holodeck on my idea concepts for books and tabletop
           | campaigns.
           | 
           | Computer, give me a questline for a faction called the Silver
           | Carders, there's a catfolk named Marvin who is the adopted
           | son of a human named Doug Alvaro and he is the old flame of
           | the founder of the faction and there's political intrigue
           | that X Y and Z, please find a good mix-in for these 4-7 TV
           | tropes links I like to play with, go.
           | 
           | Ok now swap out the absentminded professor gadgeteer with a
           | cloud cuckoolander grandma mechanic.
           | 
           | Ok now find me a few entrypoints to this faction for my party
           | characters who are currently A, B, and C.
           | 
            | Oh yeah, once the max context is large enough for this stuff
            | to be useful, it will be great.
           | 
           | Can I do that now with manual digital tools? Of course. But
           | this lessens the activation energy/boilerplate of typing this
           | stuff up a lot.
           | 
            | Will it, long term, make future generations unable to cope
            | without the tool? Yes. Just as I cannot use a slide rule or
            | do any geometry outside of a classroom; I have computer tools
            | for that. LLMs will be a tool that, after 20 years, will be
            | normalized enough.
           | 
            | Granted, it will be odd when we have 3-book series come out
            | covering a current event that captures the public's
            | imagination within weeks of the event, instead of the
            | 3-years-later turnaround that entertainment media like books
            | and movies take today.
           | 
           | Or odd when people can pay to have their own version of the
           | story made, either inserting characters or 'what if'ing the
           | story where they can pay to alter a single plot point and see
           | how the characters react and how that modifies the overall
           | story.
           | 
           | We will all be more literarily conversant whether we want to
           | or not, and I'm not sure whether I like that or I'm annoyed
           | by it yet. Too soon to tell.
        
             | yawnxyz wrote:
             | I think some abstraction will need to occur, or it's just
             | too much information for us to ever take in and hold all at
              | once... I think this goes past my problem of "I can't eval
              | long outputs" and your pick-and-edit approach. Code
              | assistants are in the same boat right now too.
             | 
             | It looks like all these knowledge fields are converging
             | into the same problem
        
       | ed wrote:
       | Paper: https://arxiv.org/abs/2408.07055
       | 
        | The model is stock Llama 3.1, fine-tuned on a set of long
        | documents to encourage longer outputs.
       | 
       | Most of the action seems to happen in an agent.
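        | 
        | Roughly, the plan-then-write loop the paper describes looks
        | something like this sketch (the prompts and the llm() callable
        | are illustrative assumptions, not the repo's actual code):
        | 
        |   def generate_long(instruction, llm):
        |       # Step 1: ask the model for an outline, one section
        |       # per line, each with a rough word budget.
        |       plan = llm("Break this writing task into sections "
        |                  "with word budgets, one per line:\n"
        |                  + instruction)
        |       sections = [s for s in plan.splitlines() if s.strip()]
        | 
        |       # Step 2: write each section in turn, feeding back
        |       # what has already been written so the next chunk
        |       # stays coherent with the earlier ones.
        |       written = []
        |       for i, section in enumerate(sections):
        |           prompt = ("Task: " + instruction
        |                     + "\nOutline:\n" + plan
        |                     + "\nAlready written:\n"
        |                     + "\n\n".join(written)
        |                     + "\nNow write only section "
        |                     + str(i + 1) + ": " + section)
        |           written.append(llm(prompt))
        |       return "\n\n".join(written)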
        
       | vessenes wrote:
       | The sample output is interesting - it has highly suggestive
       | chapter titles which read like pretty normal story beats. It
        | seems like it's guiding itself on these, then is able to chunk
        | out longer-form writing per chapter.
       | 
        | For what it's worth, the writing is... bland, in the way that
        | only an LLM's writing can be -- relatively grammatically sound
        | and totally soulless. I will never think of the love story of
       | Elizabeth and Thomas again, despite having read the entire thing.
       | 
        | In the early days of GPT-3, I experimented a lot with getting it
        | to respond _as_ certain authors, and it was really quite
        | excellent at that. This is one of the many things that seem
        | likely to have been nerfed over time, I'd guess partly because
        | human preference training just asks for bland responses, and
        | partly because the injected prompts from OpenAI strongly
        | discourage doing things related to real people, and those
        | preferences are carried through, subtly or not, into the
        | augmented training data most open models tune on.
        
         | elfelf12 wrote:
          | Is it a copyright problem or a capitalist problem? Why do we
          | only get nerfed, dumb chatbots?
          | 
          | It would be interesting to really try hard and create an LLM
          | that can write novels in the style of an author, and skip the
          | chat functionality!
        
           | sReinwald wrote:
           | Perhaps both. But I wonder if the incredible blandness of
           | most chatbots is effectively just a regression towards the
           | mean.
           | 
           | Most AI companies try to train their bots on vast amounts of
           | different data, and I suspect it's very difficult for that to
           | result in very creative writing when you're training on works
           | of fiction, as well as cooking recipes, Reddit comments and
           | technical documentation.
        
       | mmaunder wrote:
        | What's the difference between this and using chat history to
       | concatenate outputs and prompting with something like "Now write
       | the next section" repeatedly? I've done that with NotebookLM and
       | it'll write a complete fictional story based on sources, for
       | example.
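        | 
        | For reference, the naive version of that loop is roughly the
        | sketch below (the chat() helper is an assumed chat-completion
        | call, not any particular API):
        | 
        |   history = [{"role": "user",
        |               "content": "Write a long story about ..."}]
        |   parts = []
        |   for _ in range(10):  # arbitrary number of continuations
        |       reply = chat(history)  # assumed chat-completion call
        |       parts.append(reply)
        |       history.append({"role": "assistant", "content": reply})
        |       history.append({"role": "user",
        |                       "content": "Now write the next section."})
        |   long_output = "\n\n".join(parts)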
        
         | LeoPanthera wrote:
         | Most LLMs are trained to write "complete" outputs. So each
         | section will end up being like a tiny self-contained short
         | book. Without manual editing they will not create long
         | narratives.
        
         | dotnet00 wrote:
         | In my testing, that often causes the model to 'drift' and
         | ramble wildly compared to just getting one long output from the
         | very start.
         | 
         | The issue is probably that when you split it by just asking for
         | the next section, you're asking it to figure out how to
         | continue from a block that wasn't written with the awareness
         | that it'd have to add on to it.
         | 
          | From the diagram in the repo, I guess this first plans out the
         | structure for each block, and generates the blocks based on the
         | plan.
        
         | thomasahle wrote:
          | It would be the same if the model were "raw", trained only on
         | text completion. But all models these days are RLHF'ed on
         | (prompt, answer) pairs, so unfortunately they can get confused
         | if the prompt already contains part of an answer.
        
           | elfelf12 wrote:
           | I think base models are far superior to those boring instruct
           | tuned models. I would rather have a good text completionist
            | than a chat bot. But as far as I know I am in a minority
           | there.
        
       | wkat4242 wrote:
       | Really interesting. I wonder if you can do this on ollama too.
        
       ___________________________________________________________________
       (page generated 2024-10-07 23:00 UTC)