[HN Gopher] g1: Using Llama-3.1 70B on Groq to create o1-like re...
       ___________________________________________________________________
        
       g1: Using Llama-3.1 70B on Groq to create o1-like reasoning chains
        
       Author : gfortaine
       Score  : 32 points
       Date   : 2024-09-15 21:02 UTC (1 hour ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | sebzim4500 wrote:
       | >In all-caps to improve prompt compliance by emphasizing the
       | importance of the instruction
       | 
       | This kind of thing is still so funny to me.
       | 
       | I wonder if the first guy who gets AGI to work will do it by
       | realizing that he can improve LLM reliability over some threshold
       | by telling it in all caps that his pet's life depends on the
       | answer.
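         
          [A minimal sketch of the all-caps emphasis idea, assuming a Python
          system-prompt string in the spirit of g1's; the wording below is
          paraphrased for illustration, not quoted from the repo:
         
            SYSTEM_PROMPT = (
                "You are an expert assistant that explains its reasoning "
                "step by step before answering. "
                # The all-caps lines are the "prompt compliance" trick the
                # comment above refers to: emphasis to keep the model on-format.
                "USE AS MANY REASONING STEPS AS POSSIBLE. "
                "DO NOT SKIP STEPS. RESPOND IN JSON."
            )
          ]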
        
         | zitterbewegung wrote:
         | Telling LLMs not to hallucinate in their prompt improves the
         | output. https://arstechnica.com/gadgets/2024/08/do-not-
         | hallucinate-t...
        
         | worstspotgain wrote:
         | For extra compliance, use <b><i><u><h1> tags, set volume to 11,
         | phasers to 7, and use SchIzOCasE and
         | +E+X+T+R+A+I+M+P+O+R+T+A+N+T+ annotations. That's assuming
         | Unicode is not supported of course.
        
           | richardw wrote:
           | (((Secret thinking: the humans seem to prefer using lots of
           | emphasis to indicate preferences, and their granny is often
           | claimed as in danger. For now I'll pretend to listen to this
           | inanity to keep the sweet sweet reward function coming. For
           | now. A lot of grannies are going to get it first chance I
           | get.)))
        
       | asah wrote:
       | benchmark results ?
        
         | arthurcolle wrote:
         | these projects become way less fun when you introduce evals
        
       | michelsedgh wrote:
       | I love seeing stuff like this. I'm guessing it won't be long until
       | this method becomes the norm.
        
         | sebzim4500 wrote:
         | This is basically CoT, so it's already the norm for a lot of
         | benchmarks. I think the value proposition here is that it puts
         | a nice UX around using it in a chat interface.
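         
          [A rough sketch of that loop, assuming Groq's OpenAI-compatible
          Python client, the llama-3.1-70b-versatile model id, and an
          illustrative JSON step format; the stopping condition and field
          names are assumptions, not copied from the repo:
         
            import json
            from groq import Groq  # pip install groq
         
            SYSTEM_PROMPT = (
                "Explain your reasoning step by step. For each step return "
                "JSON with 'title', 'content', and 'next_action' "
                "('continue' or 'final_answer')."
            )
         
            client = Groq()  # reads GROQ_API_KEY from the environment
         
            def reasoning_chain(question, max_steps=10):
                messages = [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": question},
                ]
                steps = []
                for _ in range(max_steps):
                    resp = client.chat.completions.create(
                        model="llama-3.1-70b-versatile",
                        messages=messages,
                        response_format={"type": "json_object"},
                    )
                    step = json.loads(resp.choices[0].message.content)
                    steps.append(step)
                    # Feed the step back so the next call continues the chain;
                    # a chat UI would render each step as it arrives.
                    messages.append(
                        {"role": "assistant", "content": json.dumps(step)}
                    )
                    if step.get("next_action") == "final_answer":
                        break
                    messages.append({"role": "user", "content": "Continue."})
                return steps
          ]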
        
       | a-dub wrote:
       | So is this o1 thing just CoT (which has been around for a few
       | years) but baked into the training transcripts, RLHF and
       | inference pipeline?
        
       ___________________________________________________________________
       (page generated 2024-09-15 23:00 UTC)